Abstract. In this paper the Neciporuk method for proving lower bounds on the size of Boolean
formulae is reformulated in terms of one-way communication complexity. We investigate the
scenarios of probabilistic formulae, nondeterministic formulae, and quantum formulae. In all cases we
can use results about one-way communication complexity to prove lower bounds on formula size. In
the latter two cases we newly develop the employed communication complexity bounds. The main
results regarding formula size are as follows: a polynomial size gap between probabilistic/quantum
and deterministic formulae; a near-quadratic size gap for nondeterministic formulae with limited
access to nondeterministic bits; a near-quadratic lower bound on quantum formula size, as well as a
polynomial separation between the sizes of quantum formulae with and without multiple read random
inputs. The methods for quantum and probabilistic formulae employ a variant of the Neciporuk
bound in terms of the VC-dimension. Regarding communication complexity we give optimal
separations between one-way and two-way protocols in the cases of limited nondeterministic and quantum
communication, and we show that zero-error quantum one-way communication complexity
asymptotically equals deterministic one-way communication complexity for total functions.
minism, lower bounds, computational complexity
1. Introduction. One of the most important goals of complexity theory is to
prove lower bounds on the size of Boolean circuits computing some explicit functions.
Currently only linear lower bounds for this complexity measure are known. It is well
known, however, that superlinear lower bounds are provable if we restrict the circuits
to fan-out one, i.e., if we consider Boolean formulae. The best known technique for
proving such bounds is due to Neciporuk [33], see also [7]. It applies to Boolean formulae
with arbitrary gates of fan-in two. For other methods applying to circuits over a less
general basis of gates see e.g. [7]. The largest lower bounds provable with Neciporuk's
method are of the order $\Theta(n^2/\log n)$.

The complexity measure of formula size is not only interesting because formulae 
are restricted circuits which are easier to handle in lower bounds, but also because the 
logarithm of the formula size is asymptotically equivalent to the circuit depth. Thus 
increasing the range of lower bounds for formula size is interesting. 

It has become customary to consider randomized algorithms as a standard model 
of computation. While randomization can be eliminated quite efficiently using the 
nonuniformity of circuits, randomized circuits are sometimes simpler to describe and 
more concise than deterministic circuits. It is natural to ask whether we can prove 
lower bounds for the size of randomized formulae. 

More generally, we would like to consider modes of computation other than
randomization. First we are interested in nondeterministic formulae. It turns out that
general nondeterministic formulae are as powerful as nondeterministic circuits, and
thus intractable for lower bounds with current techniques. But this construction relies
heavily on a large consumption of nondeterministic bits guessed by the simulating
formula; in other words, such a simulation drastically increases the length of proofs
involved. So we can ask whether the size of formulae with a limited number of
nondeterministic guesses can be lower bounded, in the spirit of research on limited
nondeterminism [15].

Finally, we are interested in quantum computing. The model of quantum
formulae has been introduced by Yao in [42]. He gives a superlinear lower bound for
quantum formulae computing the MAJORITY function. Later Roychowdhury and
Vatan [38] proved that a somewhat weaker form of the classical Neciporuk method
can be applied to give lower bounds for quantum formulae of the order $\Omega(n^2/\log^2 n)$,
and that quantum formulae can actually be simulated quite efficiently by classical
Boolean circuits.

The outline of this paper is the following. First we observe that the Neciporuk
method can be defined in terms of one-way communication complexity. While this
observation is not relevant for deterministic computations, its power becomes useful
if we consider other modes of computation. First we consider probabilistic formulae.
We derive a variation of the Neciporuk bound in terms of randomized communication
complexity and, using results from [27], a combinatorial variant involving the
VC-dimension. Applying this lower bound we show a near-quadratic lower bound for
probabilistic formula size (Corollary 3.7). We also show that there is a function, for
which probabilistic formulae are smaller by a factor of $\sqrt{n}$ than deterministic formulae
and even Las Vegas (zero error) formulae (Corollary 3.13). This is shown to be the
maximal such gap provable if the lower bound for deterministic formulae is given by
the Neciporuk method. Furthermore we observe that the standard Neciporuk bound
asymptotically also works for Las Vegas formulae.

We then introduce Neciporuk methods for nondeterministic formulae and for
quantum formulae. To apply these generalizations we have to provide lower bounds
for one-way communication complexity with limited nondeterminism, and for
quantum one-way communication complexity. For both measures lower bounds explicitly
depending on the one-way restriction were unknown prior to this work. Since the
communication problems we investigate are asymmetric (i.e., Bob receives far fewer
inputs than Alice) our results show optimal separations between one- and two-round
communication complexity for limited nondeterministic and for quantum
communication complexity. Such separations have been known previously only for deterministic
and probabilistic protocols, see [23 Ell- 
In the nondeterministic case we give a specific combinatorial argument for the
communication lower bound (Theorem 5.5). In the quantum case we give a general
lower bound method based on the VC-dimension (Theorem 5.9), that can also be
extended to the case where the players share prior entanglement. Furthermore we
show that exact and Las Vegas quantum one-way communication complexity are
never much smaller than deterministic one-way communication complexity for total
functions (Theorems 5.11 and 5.12).

Then we are ready to give Neciporuk style lower bound methods for nondeter- 
ministic formulae and quantum formulae. In the nondeterministic case we show that 
for an explicit function there is a threshold on the amount of nondeterminism needed 
for efficient formulae, i.e., a near-quadratic size gap occurs between formulae allowed 
to make a certain amount of nondeterministic guesses, and formulae allowed a log- 
arithmic factor more. The threshold is polynomial in the input length (Theorem 
6.4). 
For quantum formulae we show a lower bound of $\Omega(n^2/\log n)$, improving on the
best previously known bound given in [38] (Theorem 6.11). More importantly, our
bound also applies to a more general model of quantum formulae, which are e.g.
allowed to access multiple read random variables. This feature makes these generalized
quantum formulae a proper generalization of both quantum formulae and probabilistic
formulae. It turns out that we can give an $\Omega(\sqrt{n}/\log n)$ separation between formulae
with multiple read random variables and without this option, even if the former are
classical and the latter are quantum (Corollary 6.6). Thus quantum formulae as
defined by Yao are not capable of efficiently simulating classical probabilistic formulae.
We show that the VC-dimension variant of the Neciporuk bound holds for generalized
quantum formulae and the standard Neciporuk bound holds for generalized quantum
Las Vegas formulae (Theorem 6.10).

The organization of the paper is as follows: in §2 we describe some preliminar- 
ies regarding the VC-dimension, classical communication complexity, and Boolean 
circuits. In §3 we expose the basic lower bound approach and apply the idea to 
probabilistic formulae. In §4 we give more background on quantum computing and 
information theory. In §5 we give the lower bounds for nondeterministic and quantum 
one-way communication complexity. In §6 we derive our results for nondeterministic 
and quantum formulae and apply those bounds. In §7 we give some conclusions. 

2. Preliminaries. 

2.1. The VC-dimension. We start with a useful combinatorial concept [40],
the Vapnik-Chervonenkis dimension. This will be employed to derive lower bounds for
one-way communication complexity and then to give generalizations of the Neciporuk
lower bound on formula size.

Definition 2.1. A set $S$ is shattered by a set of Boolean functions $T$, if for all
$R \subseteq S$ there is a function $f \in T$, so that for all $x \in S$: $f(x) = 1 \iff x \in R$.

The size of a largest set shattered by $T$ is called the VC-dimension $VC(T)$ of $T$.
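
To make the definition concrete, the following minimal Python sketch computes the VC-dimension of a small function class by brute force. The representation of functions as dictionaries and the two example classes (thresholds and all functions on a four-element domain) are assumptions made for illustration only; they do not appear in the text.

```python
from itertools import combinations

def is_shattered(points, functions):
    """True if every 0/1 labelling of `points` is realised by some function."""
    patterns = {tuple(f[x] for x in points) for f in functions}
    return len(patterns) == 2 ** len(points)

def vc_dimension(domain, functions):
    """Size of a largest subset of `domain` shattered by `functions` (brute force)."""
    best = 0
    for size in range(1, len(domain) + 1):
        if any(is_shattered(s, functions) for s in combinations(domain, size)):
            best = size
        else:
            break  # subsets of shattered sets are shattered, so we can stop here
    return best

# Hypothetical example classes on the domain {0, 1, 2, 3}.
domain = [0, 1, 2, 3]
# Threshold functions f_t(x) = 1 iff x >= t, for t = 0, ..., 4.
thresholds = [{x: int(x >= t) for x in domain} for t in range(5)]
# All 16 Boolean functions on the domain.
all_functions = [{x: (mask >> x) & 1 for x in domain} for mask in range(16)]

print(vc_dimension(domain, thresholds))
print(vc_dimension(domain, all_functions))
```

The search is exponential in the domain size, so it is only meant for toy instances.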

The following fact [1U| will be useful. 

Fact 2.2. Let $T$ be a set of Boolean functions $f : X \to \{0,1\}$. Then

$$2^{VC(T)} \le |T| \le (|X| + 1)^{VC(T)}.$$

2.2. One-way communication complexity. We now define the model of one-way
communication complexity, first described by Yao [41]. See [28] for more details
on communication complexity.

Definition 2.3. Let $f : X \times Y \to \{0,1\}$ be a function. Two players Alice
and Bob with unrestricted computational power receive inputs $x \in X$, $y \in Y$ to the
function.

Alice sends a binary encoded message to Bob, who then computes the function 
value. The complexity of a protocol is the worst case length of the message sent (over 
all inputs). 

The deterministic one-way communication complexity of $f$, denoted $D(f)$, is the
complexity of an optimal deterministic protocol computing $f$.

In the case Bob sends one message and Alice announces the result we use the
notation $D^B(f)$.

The communication matrix of a function f is the matrix M with M(x, y) = f(x, y) 
for all inputs x,y. 






H. Klauck 



We will consider different modes of acceptance for communication protocols. Let 
us begin with nondeterminism. 

Definition 2.4. In a nondeterministic one-way protocol for a Boolean function
$f : X \times Y \to \{0,1\}$ Alice first guesses nondeterministically a sequence of $s$ bits. Then
she sends a message to Bob, depending on the sequence and her own input. Bob 
computes the function value. Note that the guessed sequence is only known to Alice. 
An input is accepted, if there is a guess, so that Bob accepts given the message and 
his input. All other inputs are rejected. 

The complexity of a nondeterministic one-way protocol with s nondeterministic 
bits is the length of the longest message used. 

The nondeterministic communication complexity N(f) is the complexity of an 
optimal one-way protocol for f using arbitrarily many nondeterministic bits. 

$N_s(f)$ denotes the complexity of an optimal nondeterministic protocol for $f$, which
uses at most $s$ private nondeterministic bits for every input.

Note that if we do not restrict the number of nondeterministic bits, then nonde- 
terministic protocols with more than one round of communication can be simulated: 
Alice guesses a dialogue, sends it if it is consistent with her input, Bob checks the 
same with his input and accepts if acceptance is implied by the dialogue. 

While nondeterministic communication is a theoretically motivated model, prob- 
abilistic communication is the most powerful realistic model of communication besides 
quantum mechanical models. 

Definition 2.5. In a probabilistic protocol with private random coins Alice and 
Bob each possess a source of independent random bits with uniform distribution. The 
players are allowed to access that source and communicate depending on their inputs 
and the random bits they read. We distinguish the following modes of acceptance: 

1. In a Las Vegas protocol the players are not allowed to err. They may, however,
give up without an output with some probability $\epsilon$. The complexity of a one-way
protocol is the worst case length of a message used by the protocol, the Las Vegas
complexity of a function $f$ is the complexity of an optimal Las Vegas protocol computing
$f$, and is denoted $R_{0,\epsilon}(f)$.

2. In a probabilistic protocol with bounded error $\epsilon$ the output has to be correct
with probability at least $1 - \epsilon$. The complexity of a protocol is the worst case length of
the message sent (over all inputs and the random guesses), the complexity of a function
is the complexity of an optimal protocol computing that function and is denoted $R_\epsilon(f)$.
For $\epsilon = 1/3$ the notation is abbreviated to $R(f)$.

3. A bounded error protocol is a Monte Carlo protocol, if inputs with $f(x_A, x_B) = 0$
are rejected with certainty.

We also consider probabilistic communication with public randomness. Here the
players have access to a shared source of random bits without communicating.
Complexity in this model is denoted $R^{pub}$, with acceptance defined as above.

The difference between probabilistic communication complexity with public and 
with private random bits is actually only an additive O(logn) as shown in jSJ] by an 
argument based on the nonuniformity of the model. 

The following communication problems are frequently considered in the literature 
about communication complexity. 

Definition 2.6. Disjointness problem

$DISJ_n(x_1 \ldots x_n, y_1 \ldots y_n) = 1 \iff \forall i : \neg x_i \vee \neg y_i$. The function accepts, if the two
sets described by the inputs are disjoint.

Index function

$IX_{2^n}(x_1 \ldots x_{2^n}, y_1 \ldots y_n) = 1 \iff x_y = 1$.

The deterministic one-way communication complexity of a function can be
characterized as follows. Let $row(f)$ be the number of different rows in the communication
matrix of $f$.

Fact 2.7. $D(f) = \lceil \log row(f) \rceil$.

It is relatively easy to estimate the deterministic one-way communication
complexity using this fact. As an example consider the index function: obviously
$D^B(IX_n) = \log n$, and it is easy to see with Fact 2.7 that $D(IX_n) = n$, since there are
$2^n$ different rows in the communication matrix of $IX_n$. In [27] it is shown that also
$R^{pub}(IX_n) = \Omega(n)$.
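
The row-counting characterization of Fact 2.7 is easy to check on small instances. The sketch below, given only as an illustration, builds the communication matrix of the index function for $n = 8$ and counts distinct rows in both directions. Passing Bob's address as an integer rather than as $\log n$ bits, and the helper name `one_way_complexity`, are assumptions of this example.

```python
from itertools import product
from math import ceil, log2

def index_function(x, y):
    """IX_n: x is a tuple of n bits, y an address in range(n); returns x[y]."""
    return x[y]

def one_way_complexity(f, sender_inputs, receiver_inputs):
    """ceil(log2(number of distinct rows)) of the communication matrix of f."""
    rows = {tuple(f(s, r) for r in receiver_inputs) for s in sender_inputs}
    return ceil(log2(len(rows)))

n = 8
xs = list(product([0, 1], repeat=n))  # Alice's inputs: all n-bit strings
ys = list(range(n))                   # Bob's inputs: all addresses

# Alice sends, Bob answers: every string x gives a different row.
d_alice_to_bob = one_way_complexity(index_function, xs, ys)
# Roles swapped: Bob (holding the address) sends, Alice answers.
d_bob_to_alice = one_way_complexity(lambda y, x: index_function(x, y), ys, xs)

print(d_alice_to_bob, d_bob_to_alice)
```

For this instance the two values are $n = 8$ and $\log n = 3$, matching the asymmetry described in the text.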

A general lower bound method for probabilistic one-way communication
complexity is shown in [27].

We consider the VC-dimension for functions as follows. 

Definition 2.8. For a function $f : X \times Y \to \{0,1\}$ let
$T = \{g \mid \exists x \in X : \forall y \in Y : g(y) = f(x,y)\}$. Then define $VC(f) = VC(T)$.

Fact 2.9. $R^{pub}(f) = \Omega(VC(f))$.

In §5.2 we will generalize this result to quantum one-way protocols.

With the above definition $\lceil \log |T| \rceil = D(f)$. Then
$VC(f) \le D(f) \le \lceil \log(|Y| + 1) \cdot VC(f) \rceil$ due to Fact 2.2.
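
The chain $VC(f) \le D(f) \le \lceil \log(|Y| + 1) \cdot VC(f) \rceil$ can be checked numerically on a toy function. The following sketch does so for a "greater than" function on 3-bit numbers; this example function and all helper names are assumptions chosen for illustration and are not taken from the text.

```python
from itertools import combinations
from math import ceil, log2

def row_functions(f, xs, ys):
    """The set T of Definition 2.8: one row of the communication matrix per x."""
    return {tuple(f(x, y) for y in ys) for x in xs}

def vc_dim(rows, num_cols):
    """Largest number of columns on which the rows realise all 0/1 patterns."""
    best = 0
    for size in range(1, num_cols + 1):
        if any(len({tuple(r[c] for c in cols) for r in rows}) == 2 ** size
               for cols in combinations(range(num_cols), size)):
            best = size
        else:
            break  # no larger column set can be shattered
    return best

def greater_than(x, y):
    """Hypothetical example function: 1 iff x > y."""
    return int(x > y)

xs = ys = list(range(8))
rows = row_functions(greater_than, xs, ys)
d = ceil(log2(len(rows)))             # D(f) via the row count
vc = vc_dim(rows, len(ys))            # VC(f)
upper = ceil(log2(len(ys) + 1) * vc)  # ceil(log(|Y| + 1) * VC(f))

print(vc, d, upper)
```

Here the rows form a family of threshold functions, so the VC-dimension is small while the number of distinct rows is maximal; the upper bound is nearly tight for this instance.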

Las Vegas communication can be quadratically more efficient than deterministic
communication in many-round protocols for total functions [28]. For one-way
protocols the situation is different [20].

Fact 2.10. For all total functions $f$:

$$R^{pub}_{0,1/2}(f) \ge D(f)/2.$$

We will also generalize this result to quantum communication in §5.2. In our 
proofs for these generalizations we will employ quantum information theoretic meth- 
ods as opposed to the proofs in the classical case, which were relying on combinatorial 
techniques. 

2.3. Circuits and formulae. We now define the models of Boolean circuits and
formulae. Note that we do not consider questions of uniformity of families of such
circuits. For the definition of a Boolean circuit we refer to [7]. We consider circuits
with fan-in 2. While it is well known that almost all $f : \{0,1\}^n \to \{0,1\}$ need circuit
size $\Omega(2^n/n)$ (see e.g. 0), superlinear lower bounds for explicit functions are only
known for restricted models of circuits.

Definition 2.11. A (deterministic) Boolean formula is a Boolean circuit with 
fan-in 2 and fan- out 1. The Boolean inputs may be read arbitrarily often, the gates 
are arbitrary, constants 0,1 may be read. 

The size (or length) of a deterministic Boolean formula is the number of its non- 
constant leaves. 

It is possible to show that for Boolean functions the logarithm of the formula size 
is linearly related to the optimal circuit depth (see 0). 

Probabilistic formulae have been considered in [391 EI] with the purpose of 
constructing efficient (deterministic) monotone formulae for the majority function in 
a probabilistic manner. 

The ordinary model of a probabilistic formula is a probability distribution on
deterministic formulae. Since formulae are also an interesting data structure we are
interested in a more compact model. "Fair" probabilistic formulae are formulae that
read input variables plus additional random variables. The other model will be called
"strong" probabilistic formulae.

Definition 2.12. A fair probabilistic formula is a Boolean formula, which works
on input variables and additional random variables $r_1, \ldots, r_m$. A strong probabilistic
formula is a probability distribution $F$ on deterministic Boolean formulae. Fair
resp. strong probabilistic formulae $F$ compute a Boolean function $f$ with bounded
error, if

$$\Pr[F(x) \ne f(x)] \le 1/3.$$

Fair resp. strong probabilistic formulae $F$ are Monte Carlo formulae for $f$ (i.e.,
have one-sided error), if

$$\Pr[F(x) = 0 \mid f(x) = 1] \le 1/2 \quad\text{and}\quad \Pr[F(x) = 1 \mid f(x) = 0] = 0.$$

A Las Vegas formula consists of two Boolean formulae. One formula computes the
output, the other (verifying) formula indicates whether the computation of the first can
be trusted or not. Both work on the same inputs. There are four different outputs, of
which two are interpreted as "?" (the verifying formula rejects), and the others as 0
resp. 1. A Las Vegas formula $F$ computes $f$, if the outputs 0 and 1 are always correct,
and

$$\Pr[F(x) = {?}] \le 1/2.$$

The size of a fair probabilistic formula is the number of its nonconstant leaves,
the size of a strong probabilistic formula is the expected size of a deterministic formula
according to $F$.

It is easy to see that one can decrease the error probability to arbitrarily small
constants, while increasing the size by a constant factor, therefore we will sometimes
allow different error probabilities.

A strong probabilistic formula $F$ can be transformed into a deterministic formula.
For Monte Carlo formulae this increases the size by a factor of $O(n)$: choose $O(n)$
formulae randomly according to $F$ and connect them by an OR gate. An application of
the Chernoff inequality proves that the error probability is so small that no errors are
possible anymore. Strong formulae with bounded (two-sided) error are derandomized
by picking $O(n)$ formulae and connecting them by an approximative majority function.
That function outputs 1 on $n$ Boolean variables if at least $2n/3$ have the value 1, and
outputs 0, if at most $n/3$ variables have the value 1. An approximative majority
function can be computed by a deterministic formula of size $O(n^2)$, see [20]. Thus
the size increases by a factor of $O(n^2)$.

Let us remark that strong probabilistic formulae may have sublinear length, this 
is impossible for fair probabilistic formulae depending on all inputs. An approximative 
majority function may be computed by a strong probabilistic formula through picking 
a random input and outputting its value. 

We will later also consider nondeterministic formulae. 

Definition 2.13. A nondeterministic formula with $s$ nondeterministic bits is a
formula with additional input variables $a_1, \ldots, a_s$. The formula accepts an input $x$, if
there is a setting of the variables $a$, so that $(a, x)$ is accepted.

3. The general lower bound method and probabilistic formulae. There
are some well known results giving lower bounds for the length of Boolean formulae.
The method of Neciporuk [33], [7] remains the one giving the largest lower bounds
among those methods working for formulae in which all fan-in 2 functions are allowed
as gates. For other methods see [7] and [3]; a characterization for formula size with
gates AND, OR, NOT using the communication complexity of a certain game is also
known (see [28]). For such formulae the largest known lower bound is a near-cubic
bound due to Hastad [Tfi] . 

Let us first give the standard definition of the Neciporuk bound. 

Let $f$ be a function on the $n$ variables in $X = \{x_1, \ldots, x_n\}$. For a subset $S \subseteq X$
let a subfunction on $S$ be a function induced by $f$ by fixing the variables in $X - S$.
The set of all subfunctions on $S$ is called the set of $S$-subfunctions of $f$.

Fact 3.1 (Neciporuk). Let $f$ be a Boolean function on $n$ variables. Let $S_1, \ldots, S_k$
be a partition of the variables and $s_i$ the number of $S_i$-subfunctions of $f$. Then every
deterministic Boolean formula for $f$ has size at least

$$\frac{1}{4} \sum_{i=1}^{k} \log s_i.$$

It is easy to see that the Neciporuk function $(1/4) \sum_{i=1}^{k} \log s_i$ is never larger than
$n^2/\log n$.
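
For very small functions the Neciporuk function can be evaluated directly by enumerating subfunctions. The sketch below is one way to do this; the two example functions (parity and a 4-to-1 multiplexer) and the chosen partitions are assumptions for illustration and are not discussed in the text.

```python
from itertools import product
from math import log2

def count_subfunctions(f, n, block):
    """Number of distinct functions on `block` obtained by fixing all other variables."""
    rest = [i for i in range(n) if i not in block]
    seen = set()
    for fixed in product([0, 1], repeat=len(rest)):
        table = []
        for free in product([0, 1], repeat=len(block)):
            x = [0] * n
            for i, v in zip(rest, fixed):
                x[i] = v
            for i, v in zip(block, free):
                x[i] = v
            table.append(f(x))
        seen.add(tuple(table))  # truth table of this subfunction
    return len(seen)

def neciporuk_bound(f, n, partition):
    """(1/4) * sum_i log2(s_i) for the given partition of variable indices."""
    return sum(log2(count_subfunctions(f, n, b)) for b in partition) / 4

def parity(x):
    return sum(x) % 2

def mux(x):
    """Hypothetical 4-to-1 multiplexer: x[0], x[1] address one of x[2..5]."""
    return x[2 + 2 * x[0] + x[1]]

print(neciporuk_bound(parity, 4, [[0, 1], [2, 3]]))
print(neciporuk_bound(mux, 6, [[0, 1], [2, 3, 4, 5]]))
```

Parity has only two subfunctions on any block, so the bound is weak for it, while the address block of the multiplexer already sees all 16 settings of the data bits as distinct subfunctions.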

Definition 3.2. The function "indirect storage access" ISA is defined as follows:
there are three blocks of inputs $U, X, Y$ with $|U| = \log n - \log\log n$, $|X| = |Y| = n$.
$U$ addresses a block of length $\log n$ in $X$, which addresses a bit in $Y$. This bit is the
output, thus $ISA(U, X, Y) = Y_{X_U}$.

The following is proved e.g. in [7], [43].

Fact 3.3. Every deterministic formula for ISA has size $\Omega(n^2/\log n)$.

There is a deterministic formula for ISA with size $O(n^2/\log n)$.
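
As a small illustration of Definition 3.2, the ISA function can be written down directly. The sketch assumes that $n$ is a power of two with $\log n$ dividing $n$, and that addresses are read most significant bit first; the bit order is a convention chosen here, not one fixed by the text.

```python
def bits_to_int(bits):
    """Interpret a sequence of bits as a binary number, most significant first."""
    value = 0
    for b in bits:
        value = 2 * value + b
    return value

def isa(u, x, y):
    """Indirect storage access: U selects a block of X, that block selects a bit of Y."""
    n = len(y)
    block_len = n.bit_length() - 1   # log2(n), n assumed to be a power of two
    block = bits_to_int(u)           # index of the addressed block in X
    pointer = x[block * block_len:(block + 1) * block_len]
    return y[bits_to_int(pointer)]

# n = 4: |U| = 1, X consists of two blocks of two bits, Y has four bits.
print(isa([1], [0, 0, 1, 0], [0, 0, 1, 0]))  # block 1 = (1, 0) addresses Y[2]
print(isa([0], [0, 0, 1, 0], [0, 0, 1, 0]))  # block 0 = (0, 0) addresses Y[0]
```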

We are now going to generalize the Neciporuk method to probabilistic formulae, 
and later to nondeterministic and quantum formulae. We will use a simple connection 
to one-way communication complexity and use the guidance obtained by this connec- 
tion to give lower bounds from lower bounds in communication complexity. In the 
case of probabilistic formulae we will employ the VC-dimension to give lower bounds. 
Informally speaking we will replace the log of the size of the set of subfunctions by 
the VC-dimension of that set and get a lower bound for probabilistic formulae. 

Our lower bounds are valid in the model of strong probabilistic formulae. Corol- 
lary 3.7 shows that even strong probabilistic formulae with two-sided error do not 
help to decrease the size of formulae for ISA. All upper bounds will be given for fair 
formulae. 

We are going to show that the (standard) Neciporuk bound is at most a factor of $O(\sqrt{n})$
larger than the probabilistic formula size for total functions. Thus the maximal gap
we can show using the currently best general lower bound method is limited.

On the other hand we describe a Boolean function, for which fair probabilistic
formulae with one-sided error are a factor $\Omega(\sqrt{n})$ smaller than Las Vegas formulae, as
well as a similar gap between one-sided error formulae and two-sided error formulae.
The lower bound on Las Vegas formulae uses the new observation that the standard 
Neciporuk bound asymptotically also works for Las Vegas formulae. 

3.1. Lower bounds for probabilistic formulae. We now derive a Neciporuk 
type bound with one-way communication. 

Definition 3.4. Let $f$ be a Boolean function on $n$ inputs and let $y_1, \ldots, y_k$ be a
partition of the input variables.
We consider $k$ communication problems for $i = 1, \ldots, k$. Player Bob receives
all inputs in $y_i$, player Alice receives all other inputs. The deterministic one-way
communication complexity of $f$ under this partition of inputs is called $D(f_i)$. The
public coin bounded error one-way communication complexity of $f$ under this partition
of inputs is called $R^{pub}(f_i)$.

The probabilistic Neciporuk function is $(1/4) \sum_i R^{pub}(f_i)$.

It is easy to see that $(1/4) \sum_i D(f_i)$ coincides with the standard Neciporuk
function and is therefore a lower bound for deterministic formula size due to Fact 3.1.

Theorem 3.5. The probabilistic Neciporuk function is a lower bound for the size 
of strong probabilistic formulae with bounded error. 

Proof. We will show for every partition $y_1, \ldots, y_k$ of the inputs, how a strong
probabilistic formula $F$ can be simulated in the $k$ communication games. Let $F_i$ be
the distribution over deterministic formulae on variables in $y_i$ induced by picking a
deterministic formula as in $F$ and restricting to the subformula with all leaves labeled
by variables in $y_i$ and containing all paths from these to the root. We want to simulate
the formula in game $i$ so that the probabilistic one-way communication is bounded
by the expected number of leaves in $F_i$.

We are given a probabilistic formula $F$. The players now pick a deterministic
formula $F'$ induced by $F$ with their public random bits. Player Alice knows all the
inputs except those in $y_i$. This also fixes a subformula $F'_i$ drawn from $F_i$.
Actually the players have only access to an arbitrarily large public random string, so the
distributions $F_i$ may only be approximated within arbitrary precision. This alters
success probabilities by arbitrarily small values. We disregard these marginal probability
changes.

Let $V_i$ contain the vertices in $F'_i$ which have 2 predecessors in $F'_i$, and let $P_i$
contain all paths, which start in $V_i$ or at a leaf, and which end in $V_i$ or at the root,
but contain no further vertices from $V_i$. It suffices, if Alice sends 2 bits for each such
path, which show whether the last gate of the path computes 0, 1, $g$, or $\neg g$, for the
function $g$ computed by the first gate of the path. Then Bob can evaluate the formula
alone.

There are at most $2|V_i| + 1$ paths as described, since the fan-in of the formula is
2. Thus the overall communication is $4|V_i| + 2$. The set of leaves $L_i$ with variables
from $y_i$ has $|V_i| + 1$ elements, and thus

$$R^{pub}(f_i) \le 4|V_i| + 2 < 4|L_i|$$

and $(1/4) \sum_i R^{pub}(f_i)$ is a lower bound for the length $E[\sum_i |L_i|] = \sum_i E[|L_i|]$ of the
probabilistic formula. □

Let $VC(f_i)$ denote the VC-dimension of the communication problem $f_i$. We call
$\sum_i VC(f_i)$ the VC-Neciporuk function.

Corollary 3.6. The VC-Neciporuk function is an asymptotic lower bound for
the length of strong probabilistic formulae with bounded error.

The standard Neciporuk function is an asymptotical lower bound for the length of 
strong Las Vegas formulae for total functions. 

Proof. Using Fact 2.9 the VC-dimension is an asymptotical lower bound for the 
probabilistic public coin bounded error one-way communication complexity. 

As in the proof of Theorem 3.5 we may simulate a Las Vegas formula by Las 
Vegas public coin one-way protocols. Using Fact 2.10 public coin Las Vegas one-way 
protocols for total functions can only be a constant factor more efficient than optimal 
deterministic one-way protocols. □ 
According to Fact 3.3 the deterministic formula length of the indirect storage
access function (ISA) from Definition 3.2 is $\Theta(n^2/\log n)$. We now employ our method
to show a lower bound of the same order for strong bounded error probabilistic
formulae. Thus ISA is an explicit function for which strong probabilism does not allow
to decrease formula size significantly.

Corollary 3.7. Every strong probabilistic formula for the ISA function (with
bounded error) has length $\Omega(n^2/\log n)$.

Proof. ISA has inputs $Y, X, U$ and computes $Y_{X_U}$. First we define a partition.
We partition the inputs in $X$ into $n/\log n$ blocks containing $\log n$ bits each, all other
inputs are in one additional block. In a communication game Alice thus receives all
inputs but those in one block of $X$. Let $S$ denote the set of possible values of the
variables in that block. This set is shattered: Let $R \subseteq S$ and $R = \{r_1, \ldots, r_m\}$. Then
set the pointer $U$ to the block of inputs belonging to Bob, and set $Y_i = 1 \iff i \in R$.

Thus the VC-dimension of $f_i$ is at least $|S| = n$. Since there are $n/\log n$
communication games, the result follows. □
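
The shattering argument in this proof can be replayed on a toy instance. In the sketch below (with $n = 4$, Bob holding block 0 of $X$, and the most-significant-bit-first address convention, all assumptions of this illustration), Alice points $U$ at Bob's block and varies $Y$ over all indicator vectors; every 0/1 pattern on the $n$ possible block values then appears as a row.

```python
from itertools import product

def isa(u, x, y):
    """ISA(U, X, Y): U selects a block of X, whose value selects a bit of Y."""
    n = len(y)
    k = n.bit_length() - 1                           # log2(n), n a power of two
    block = int("".join(map(str, u)), 2) if u else 0
    pointer = x[block * k:(block + 1) * k]
    return y[int("".join(map(str, pointer)), 2)]

n, k = 4, 2
bob_values = list(product([0, 1], repeat=k))         # the set S, with |S| = n
patterns = set()
for y in product([0, 1], repeat=n):                  # Y is the indicator vector of R
    u = [0]                                          # U points at Bob's block (block 0)
    # Alice's remaining bits of X are irrelevant here and are set to 0.
    row = tuple(isa(u, list(b) + [0] * (n - k), list(y)) for b in bob_values)
    patterns.add(row)

print(len(patterns) == 2 ** n)
```

All $2^n$ patterns occur, so the set of block values is shattered and the VC-dimension of this communication problem is at least $n$ for the toy instance.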

The next result would be trivial for deterministic or for fair probabilistic formulae,
but strong probabilistic formulae can compute functions depending on all inputs in
sublinear size. Consider e.g. the approximate majority function. This partial function
can be computed by a strong probabilistic formula of length 1 by picking a random
input variable. For total functions on the other hand we have:

Corollary 3.8. Every strong probabilistic formula, which computes a total
function depending on $n$ variables has length $\Omega(n)$.

Proof. We partition the inputs into $n$ blocks containing one variable each. In a
communication game Alice thus receives $n - 1$ variables, and Bob receives 1 variable.
Since the function depends on both Alice's and Bob's inputs, the deterministic
communication complexity is at least 1. If the probabilistic one-way communication were
0, the error would be 1/2, thus the protocol would not compute correctly. □

Fact 2.2 shows that for a function $f : X \times Y \to \{0,1\}$ it is true that
$D(f) \le \lceil VC(f) \cdot \log(|Y| + 1) \rceil$. This leads to

Theorem 3.9. For all total functions $f : \{0,1\}^n \to \{0,1\}$ having a strong
probabilistic formula of length $s$, and for all partitions of the inputs of $f$:

$$\frac{\sum_i D(f_i)}{s} = O(\sqrt{n}).$$

Proof. Obviously $D(f_i) \le n$ for all $i$. Since a partition of the inputs can contain
at most $\sqrt{n}$ blocks with more than $\sqrt{n}$ variables, these contribute at most $n\sqrt{n}$ to the
Neciporuk function $\sum_i D(f_i)$. All smaller blocks satisfy $D(f_i) \le \lceil \sqrt{n} \cdot VC(f_i) \rceil$. Thus
overall $\sum_i D(f_i) \le O(\sqrt{n}\,(n + \sum_i VC(f_i))) = O(\sqrt{n}\, s)$, with Corollary 3.8 and Theorem
3.5. □

If a total function has an efficient (say linear length) probabilistic formula, then 
the Neciporuk method does not give near-quadratic lower bounds. 

3.2. A function, for which Monte Carlo probabilism helps. We now de- 
scribe a function, for which Monte Carlo probabilism helps as much as we can possibly 
show under the constraint that the lower bound for deterministic formulae is given 
using the Neciporuk method. We find such a complexity gap even between strong Las 
Vegas formulae and fair Monte Carlo formulae. 

Definition 3.10. The matrix product function MP receives two $n \times n$ matrices
$T^{(1)}, T^{(2)}$ over $\mathbb{Z}_2$ as input and accepts if and only if their product is not the all zero
matrix.

Theorem 3.11. The MP function can be computed by a fair Monte Carlo
formula of length $O(n^2)$.

Proof. We use a fingerprinting technique similar to the one used in matrix product
verification [31], but adapted to be computable by a formula. First we construct a
vector as a fingerprint for each matrix using some random input variables. Then
we multiply the fingerprints and obtain a bit. This bit is always zero, if the matrix
product is zero, otherwise it is 1 with probability at least 1/4. Thus we obtain a Monte Carlo
formula.

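Let $r^{(1)}, r^{(2)}$ be random strings of $n$ bits each. The fingerprints are defined as

$$F^{(1)}[k] = \bigoplus_{i=1}^{n} r^{(1)}[i]\, T^{(1)}[i,k] \quad\text{and}\quad F^{(2)}[k] = \bigoplus_{j=1}^{n} T^{(2)}[k,j]\, r^{(2)}[j].$$

Then let

$$b = \bigoplus_{k=1}^{n} F^{(1)}[k] \wedge F^{(2)}[k].$$

Obviously $b$ can be computed by a formula of length linear in the number of input bits, i.e., $O(n^2)$.

Assume $T^{(1)} T^{(2)} = 0$. Then $b = r^{(1)} T^{(1)} T^{(2)} r^{(2)} = 0$ for all $r^{(1)}$ and $r^{(2)}$.

If on the other hand $T^{(1)} T^{(2)} \ne 0$, then $i, j$ exist such that $\bigoplus_{k} T^{(1)}[i,k]\, T^{(2)}[k,j] = 1$.
Fix all random bits except $r^{(1)}[i]$ and $r^{(2)}[j]$ arbitrarily. Note that

$$b = \bigoplus_{i,j=1}^{n} r^{(1)}[i]\, r^{(2)}[j] \cdot \left( \bigoplus_{k=1}^{n} T^{(1)}[i,k]\, T^{(2)}[k,j] \right).$$

Regardless how the values of the sums for other $i, j$ look, one of the values of $r^{(1)}[i]$ and
$r^{(2)}[j]$ yields the result $b = 1$; this happens with probability at least 1/4. □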
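
The fingerprint argument can be checked exhaustively for tiny matrices. The following sketch evaluates the bit $b$ from row and column fingerprints over $\mathbb{Z}_2$ and computes the exact acceptance probability by enumerating all random strings. The list-of-lists matrix representation and the function names are assumptions of this illustration; the code evaluates the fingerprint directly and does not construct a formula.

```python
from fractions import Fraction
from itertools import product

def mp(t1, t2):
    """MP: 1 iff the product of t1 and t2 over Z_2 is not the all-zero matrix."""
    n = len(t1)
    return int(any(sum(t1[i][k] & t2[k][j] for k in range(n)) % 2
                   for i in range(n) for j in range(n)))

def fingerprint_bit(t1, t2, r1, r2):
    """b = XOR_k (F1[k] AND F2[k]) with F1 = r1^T t1 and F2 = t2 r2 over Z_2."""
    n = len(t1)
    f1 = [sum(r1[i] & t1[i][k] for i in range(n)) % 2 for k in range(n)]
    f2 = [sum(t2[k][j] & r2[j] for j in range(n)) % 2 for k in range(n)]
    return sum(a & b for a, b in zip(f1, f2)) % 2

def acceptance_probability(t1, t2):
    """Exact Pr[b = 1] over uniform r1, r2 (exhaustive, so only for small n)."""
    n = len(t1)
    hits = sum(fingerprint_bit(t1, t2, r1, r2)
               for r1 in product([0, 1], repeat=n)
               for r2 in product([0, 1], repeat=n))
    return Fraction(hits, 4 ** n)

identity = [[1, 0], [0, 1]]
a = [[1, 0], [0, 0]]
c = [[0, 0], [0, 1]]   # a * c is the zero matrix over Z_2

print(mp(identity, identity), acceptance_probability(identity, identity))
print(mp(a, c), acceptance_probability(a, c))
```

On the zero-product pair the bit is never 1, which is the one-sided error property; on the identity pair the acceptance probability exceeds 1/4.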

Theorem 3.12. For the MP function a lower bound of $\Omega(n^3)$ holds for the length
of strong Las Vegas formulae.

Proof. We use the Neciporuk method. First the partition of the inputs has to be
defined. There are $n$ blocks $b_j$ with the bits $T^{(2)}(i,j)$ for $i = 1, \ldots, n$ plus one block
for the remaining inputs. Then Alice receives all inputs except the $n$ bits in column $j$
of the second matrix, i.e., $T^{(2)}(\cdot, j)$, which go to Bob. We show that MP has now
one-way communication complexity $\Omega(n^2)$. The Neciporuk method then gives us a
lower bound of $\Omega(n^3)$ for the length of deterministic and strong Las Vegas formulae.
W.l.o.g. assume Bob has the bits $T^{(2)}(i, 1)$.

We construct a set of assignments to the input variables of Alice. Let U be a
subspace of $\mathbb{Z}_2^n$ and $T_U$ be a matrix with $T_U x = 0 \iff x \in U$. For every U
we choose $T^{(1)}$ as $T_U$ and $T^{(2)}(i,j) = 0$ for all i and for $j \ge 2$. If there are $2^{\Omega(n^2)}$
pairwise different subspaces, then we get that many different inputs. But these inputs
correspond to different rows in the communication matrix, since all $T_U$ have different
kernels. Thus with corollary 3.6 the Las Vegas one-way communication is $\Omega(n^2)$.

To see that there are $2^{\Omega(n^2)}$ pairwise different subspaces of $\mathbb{Z}_2^n$ we count the
subspaces with dimension at most n/2. There are $2^n$ vectors. There are $\binom{2^n}{n/2}$ possibilities
to choose a set of n/2 pairwise different vectors. Each such set generates a subspace
of dimension at most n/2. Each such subspace is generated by at most $\binom{2^{n/2}}{n/2}$ sets
of n/2 pairwise different vectors from the subspace. Hence this number is an upper
bound on the number of times a subspace is counted and there are at least

\[ \binom{2^n}{n/2} \Big/ \binom{2^{n/2}}{n/2} \ge 2^{\Omega(n^2)} \]

pairwise different subspaces of $\mathbb{Z}_2^n$. □
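The counting claim can be cross-checked numerically. The sketch below uses Gaussian binomial coefficients, a standard exact count of subspaces that is a different technique from the cruder estimate used in the proof; the function names are illustrative.

```python
from math import log2

def gaussian_binomial(n, k, q=2):
    """Number of k-dimensional subspaces of an n-dimensional space over GF(q)."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (k - i) - 1
    return num // den

def count_subspaces(n):
    """Total number of subspaces of Z_2^n, summed over all dimensions."""
    return sum(gaussian_binomial(n, k) for k in range(n + 1))

def log_count(n):
    """Base-2 logarithm of the number of subspaces, to compare against n^2 / 4."""
    return log2(count_subspaces(n))
```

For small n the logarithm of the count already exceeds n^2/4, which is in line with the $2^{\Omega(n^2)}$ bound.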

Corollary 3.13. There is a function that can be computed by a fair Monte
Carlo formula of length $O(N)$, while every strong Las Vegas formula needs length
$\Omega(N^{3/2})$ for this task, i.e., there is a size gap of $\Omega(N^{1/2})$ between Las Vegas and
Monte Carlo formulae.

There is also a size gap of $\Omega(N^{1/2})$ between Monte Carlo formulae and bounded
error probabilistic formulae.

Proof. The first statement is proved in the previous theorems. For the second 
statement we consider the following function with 4 matrices as input. The function 
is the parity of the MP function on the first two matrices and the complement of 
MP on the other two matrices. 

A fair probabilistic formula can obviously compute the function with length $O(n^2)$
following the construction in Theorem 3.11. Assume we have a Monte Carlo formula,
then fix the first two input matrices once in a way so that their product is the zero
matrix, and then so that their product is something else. In this way one gets Monte
Carlo formulae for both MP and its complement. Then one can use both formulae 
on the same input and combine their results to get a Las Vegas formula, which leads 
to the desired lower bound with Theorem 3.12. 

For the construction of a Las Vegas formula let F be the Monte Carlo formula for
MP and G be the Monte Carlo formula for $\neg$MP. Then F and $\neg G$ are formulae for
MP, so that F never erroneously accepts and is correct with probability 1/2, and $\neg G$
never erroneously rejects and is correct with probability 1/2. Assuming the function
value is 0, then F rejects. With probability 1/2 also $\neg G$ rejects, otherwise we may
give up. Assuming the function value is 1, then $\neg G$ accepts. With probability 1/2 also
F accepts, otherwise we may give up. The other way round, if both formulae accept
or both reject we can safely use this result, and this result comes up with probability
1/2. The only other possible result is that F rejects and $\neg G$ accepts; in this case we
have to give up. □
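The combination rule at the end of the proof can be written down directly. This is a minimal sketch, assuming the two formula outputs are available as Booleans; the function name and the use of `None` for "give up" are illustrative choices.

```python
def las_vegas_combine(f_accepts, not_g_accepts):
    """Combine F (never accepts wrongly) with not-G (never rejects wrongly).

    Returns 1 or 0 when both outputs agree, and None for "give up".
    """
    if f_accepts and not_g_accepts:
        return 1
    if not f_accepts and not not_g_accepts:
        return 0
    # Under the one-sided error guarantees only "F rejects, not-G accepts"
    # can reach this point; the answer is then undetermined.
    return None
```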

The formula described in the proof of Theorem 3.11 has the interesting property 
that each input is read exactly once, while the random inputs are read often. MP 
cannot be computed by a deterministic formula reading the inputs only once, since 
this contradicts the size bound of Theorem 3.12. Later we will show that MP cannot 
be computed substantially more efficiently by a fair probabilistic formula reading its
random inputs only once than by deterministic formulae. This follows from a lower 
bound for the size of such formulae given by the Neciporuk function divided by log n 
(corollary 6.7). For MP read-once random inputs are practically useless. 

4. Background on quantum computing and information. In this section 
we define more technical notions and describe results we will need. We start with 
information theory, then define the model of quantum formulae and give results from 
quantum information theory. We also discuss programmable quantum gates. These 
results are used in the following section to give lower bounds for one-way commu- 
nication complexity. Then we proceed to apply these to derive more formula size 
bounds.

4.1. Information theory. We now define a few notions from classical information
theory, see e.g. [11].

Definition 4.1. Let X be a random variable with values $S = \{x_1, \ldots, x_n\}$.

The entropy of X is $H(X) = -\sum_{x \in S} \Pr(X = x) \log \Pr(X = x)$.

The entropy of X given an event E is
$H(X|E) = -\sum_{x \in S} \Pr(X = x|E) \log \Pr(X = x|E)$.

The conditional entropy of X given a random variable Y is
$H(X|Y) = \sum_y \Pr(Y = y) H(X|Y = y)$, where the sum is over the values of Y. Note
that $H(X|Y) = H(XY) - H(Y)$.

The information between X and Y is $H(X : Y) = H(X) - H(X|Y)$.

The conditional information between X and Y, given Z, is
$H(X : Y|Z) = H(XZ) + H(YZ) - H(Z) - H(XYZ)$.

For $\alpha \in [0, 1]$ we define $H(\alpha) = -\alpha \log \alpha - (1 - \alpha) \log(1 - \alpha)$.

All of the above definitions use the convention $0 \log 0 = 0$.
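These quantities are easy to compute for small distributions. The following is a minimal sketch, assuming a joint distribution is given as a dictionary mapping pairs (x, y) to probabilities and logarithms are taken to base 2; all function names are illustrative.

```python
from math import log2

def entropy(dist):
    """H of a distribution {value: probability}, using 0 log 0 = 0."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def marginal(joint, axis):
    """Marginal of a joint distribution {(x, y): probability} on one coordinate."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def conditional_entropy(joint):
    """H(X|Y) = H(XY) - H(Y)."""
    return entropy(joint) - entropy(marginal(joint, 1))

def information(joint):
    """H(X:Y) = H(X) - H(X|Y)."""
    return entropy(marginal(joint, 0)) - conditional_entropy(joint)

def binary_entropy(a):
    """H(a) = -a log a - (1 - a) log(1 - a)."""
    return entropy({0: a, 1: 1 - a})
```

For example, a uniform bit X observed through a channel that flips it with probability 0.1 gives `information(joint)` equal to `1 - binary_entropy(0.1)`.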

The following result is a simplified version of Fano's inequality, see [11].

Fact 4.2. If X, Y are Boolean random variables with $\Pr(X \neq Y) \le \epsilon$, then
$H(X : Y) \ge H(X) - H(\epsilon)$.

Proof. Let $Z = 1 \iff X = Y$ and $Z = 0$ otherwise. Then
$H(X|Y) = H(XY) - H(Y) = H(ZY) - H(Y) \le H(Z) \le H(\epsilon)$. □

The next lemma is similar in the sense of a "Las Vegas variant".

Lemma 4.3. Let X be a random variable with a finite range of values S and let
Y be a random variable with range $S \cup \{x_?\}$, so that $\Pr(Y = x|X = x) \ge 1 - \epsilon$ for all
$x \in S$, $\Pr(Y = x|X \neq x) = 0$ for all $x \neq x_?$, and $\Pr(Y = x_?|X = x) \le \epsilon$ for all $x \in S$.
Then $H(X : Y) \ge (1 - \epsilon)H(X)$.

Proof. $H(X : Y) = H(X) - H(X|Y)$. Let $\delta = \Pr(Y = x_?) \le \epsilon$ and
$\epsilon_x = \Pr(Y = x_?|X = x) \le \epsilon$ and $p_x = \Pr(X = x)$.

\begin{align*}
H(X|Y) &\le (1 - \delta) H(X|Y \neq x_?) + \delta H(X|Y = x_?) \\
&= \delta H(X|Y = x_?) \\
&= -\delta \sum_x \Pr(X = x|Y = x_?) \log \Pr(X = x|Y = x_?) \\
&= -\delta \sum_x (\epsilon_x p_x/\delta) \log(\epsilon_x p_x/\delta) \\
&\le -\epsilon \sum_x p_x \log p_x + \delta \sum_x (\epsilon_x p_x/\delta) \log(\delta/\epsilon_x) \\
&\le \epsilon H(X) + \delta \log \sum_x p_x \quad \text{with Jensen's inequality} \\
&\le \epsilon H(X). \qquad □
\end{align*}

4.2. Quantum computation. We refer to for a thorough introduction into 
the field. Let us briefly mention that pure quantum states are unit vectors in a 
Hilbert space written $|\psi\rangle$, inner products are denoted $\langle\psi|\phi\rangle$, and the standard norm
is $\| |\psi\rangle \| = \sqrt{\langle\psi|\psi\rangle}$. Outer products $|\psi\rangle\langle\phi|$ are matrix valued.

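In the space $\mathbb{C}^4$ we will not only consider the standard basis $\{|00\rangle, |01\rangle, |10\rangle, |11\rangle\}$,
but also the Bell basis consisting of

\[ |\Phi^+\rangle = \tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle), \quad |\Phi^-\rangle = \tfrac{1}{\sqrt 2}(|00\rangle - |11\rangle), \]
\[ |\Psi^+\rangle = \tfrac{1}{\sqrt 2}(|01\rangle + |10\rangle), \quad |\Psi^-\rangle = \tfrac{1}{\sqrt 2}(|01\rangle - |10\rangle). \]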

The dynamics of a discrete time quantum system is described by unitary opera- 
tions. A very useful operation is the Hadamard transform. 

\[ H_2 = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}. \]

Then $H_n = H_2 \otimes \cdots \otimes H_2$ is the n-wise tensor product of $H_2$.

The XOR operation is defined by $XOR : |x, y\rangle \to |x, x \oplus y\rangle$ on Boolean values
x, y.
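As a small illustration of these two operations, the sketch below builds $H_n$ as an explicit matrix by repeated Kronecker products and applies XOR to basis-state labels. It assumes matrices are plain nested lists; the helper names are illustrative.

```python
from math import sqrt

H2 = [[1 / sqrt(2), 1 / sqrt(2)],
      [1 / sqrt(2), -1 / sqrt(2)]]

def kron(a, b):
    """Kronecker (tensor) product of two matrices given as nested lists."""
    return [[x * y for x in row_a for y in row_b] for row_a in a for row_b in b]

def hadamard(n):
    """H_n = H_2 (x) ... (x) H_2 with n factors, as a 2^n by 2^n matrix."""
    result = [[1.0]]
    for _ in range(n):
        result = kron(result, H2)
    return result

def xor_gate(x, y):
    """Action of XOR on the basis state |x, y>: returns the labels of |x, x XOR y>."""
    return x, x ^ y
```

The explicit matrix has size exponential in n, so this is only meant for checking small cases such as unitarity of `hadamard(2)`.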

Furthermore measurements are fundamental operations. Measuring as well as 
tracing out subsystems leads to probabilistic mixtures of pure states. 

Definition 4.4. An ensemble of pure states is a set $\{(p_i, |\phi_i\rangle) \mid 1 \le i \le k\}$. Here
the $p_i$ are the probabilities of the pure states $|\phi_i\rangle$. Such an ensemble is called a mixed
state.

The density matrix of a pure state $|\phi\rangle$ is the matrix $|\phi\rangle\langle\phi|$, the density matrix of
a mixed state $\{(p_i, |\phi_i\rangle) \mid 1 \le i \le k\}$ is

\[ \sum_{i=1}^{k} p_i |\phi_i\rangle\langle\phi_i|. \]

A density matrix is always Hermitian, positive semidefinite, and has trace 1. 
Thus a density matrix has nonnegative eigenvalues that sum to 1. The results of all 
measurements of a mixed state are determined by the density matrix. 

A pure state in a Hilbert space $H = H_A \otimes H_B$ cannot in general be expressed as
a tensor product of pure states in the subsystems.

Definition 4.5. A mixed state $\{(p_i, |\phi_i\rangle) \mid 1 \le i \le k\}$ in a Hilbert space $H_1 \otimes H_2$ is
called separable, if it has the same density matrix as a mixed state
$\{(q_i, |\psi_i^1\rangle \otimes |\psi_i^2\rangle) \mid i = 1, \ldots, k'\}$ for pure states $|\psi_i^1\rangle$ from $H_1$ and $|\psi_i^2\rangle$ from $H_2$
with $\sum_i q_i = 1$ and $q_i \ge 0$. Otherwise the state is called entangled.

Consider e.g. the state $|\Phi^+\rangle = \tfrac{1}{\sqrt 2}(|00\rangle + |11\rangle)$ in $\mathbb{C}^2 \otimes \mathbb{C}^2$. The state is entangled
and is usually called an EPR-pair. This name refers to Einstein, Podolsky, and Rosen, 
who first considered such states [Tl] . 

Linear transformations on density matrices are called superoperators. Not all
superoperators are physically allowed. 

Definition 4.6. A superoperator T is positive, if it sends positive semidefinite 
Hermitian matrices to positive semidefinite Hermitian matrices. A superoperator is 
trace preserving, if it maps matrices with trace 1 to matrices with trace 1. 

A superoperator T is completely positive, if every superoperator $T \otimes I_F$ is positive,
where $I_F$ is the identity superoperator on a finite dimensional extension F of the
underlying Hilbert space.

A superoperator is physically allowed, iff it is completely positive and trace pre- 
serving. 

The following theorem (called Kraus representation theorem) characterizes physi- 
cally allowed superoperators in terms of unitary operation, adding qubits, and tracing 
out [55] . 

Fact 4.7. The following statements are equivalent:

1. A superoperator T sending density matrices over a Hilbert space $H_1$ to density
matrices over a Hilbert space $H_2$ is physically allowed.

2. There is a Hilbert space $H_3$ with $\dim(H_3) \le \dim(H_1)$ and a unitary map U,
so that for all density matrices $\rho$ over $H_1$:

\[ T\rho = \mathrm{trace}_{H_1 \otimes H_3} \left[ U \left( \rho \otimes |0_{H_3 \otimes H_2}\rangle\langle 0_{H_3 \otimes H_2}| \right) U^\dagger \right]. \]

4.3. Quantum information theory. In this section we describe notions and 
results from quantum information theory. 

Definition 4.8. The von Neumann entropy of a density matrix $\rho_X$ is
$S(X) = S(\rho_X) = -\mathrm{trace}(\rho_X \log \rho_X)$.

The conditional von Neumann entropy $S(X|Y)$ of a bipartite system with density
matrix $\rho_{XY}$ is defined as $S(XY) - S(Y)$, where the state $\rho_Y$ of the Y system is the
result of a partial trace over X.

The von Neumann information between two parts of a bipartite system in a state
$\rho_{XY}$ is $S(X : Y) = S(X) + S(Y) - S(XY)$ ($\rho_X$ and $\rho_Y$ are the results of partial
traces).

The conditional von Neumann information of a system in state $\rho_{XYZ}$ is
$S(X : Y|Z) = S(XZ) + S(YZ) - S(Z) - S(XYZ)$.

Let $\mathcal{E} = \{(p_i, \rho_i) \mid i = 1, \ldots, k\}$ be an ensemble of density matrices. The Holevo
information of the ensemble is $\chi(\mathcal{E}) = S(\sum_i p_i \rho_i) - \sum_i p_i S(\rho_i)$.

The von Neumann entropy of a density matrix depends on the eigenvalues only 
so it is invariant under unitary transformations. If the underlying Hilbert space has 
dimension d, then the von Neumann entropy of a density matrix is bounded by log d. 
A fundamental result is the so-called Holevo bound [TJ\, which states an upper bound 
on the amount of classical information in a quantum state. 
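For a single qubit these quantities can be computed in closed form, since a 2 x 2 Hermitian matrix has eigenvalues $(a+d)/2 \pm \sqrt{((a-d)/2)^2 + |b|^2}$. The following is a minimal sketch restricted to that case, with base-2 logarithms; the function names are illustrative.

```python
from math import log2, sqrt

def qubit_entropy(rho):
    """Von Neumann entropy of a 2x2 density matrix given as nested lists."""
    a, d = rho[0][0].real, rho[1][1].real
    b = rho[0][1]
    radius = sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    eigenvalues = [(a + d) / 2 + radius, (a + d) / 2 - radius]
    # Eigenvalues at (numerical) zero contribute nothing, by 0 log 0 = 0.
    return -sum(l * log2(l) for l in eigenvalues if l > 1e-15)

def pure(v):
    """Density matrix |v><v| of a pure qubit state v = (v0, v1)."""
    return [[v[i] * complex(v[j]).conjugate() for j in range(2)] for i in range(2)]

def holevo(ensemble):
    """chi = S(sum_i p_i rho_i) - sum_i p_i S(rho_i) for a list [(p_i, rho_i), ...]."""
    avg = [[sum(p * rho[i][j] for p, rho in ensemble) for j in range(2)]
           for i in range(2)]
    return qubit_entropy(avg) - sum(p * qubit_entropy(rho) for p, rho in ensemble)
```

Two orthogonal pure states with equal weights give Holevo information 1, the maximum log d for d = 2, while non-orthogonal states give strictly less.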

Fact 4.9. Let X be a classical random variable with $\Pr(X = x) = p_x$. Assume for
each x a quantum state with density matrix $\rho_x$ is prepared, i.e., there is an ensemble
$\mathcal{E} = \{(p_x, \rho_x) \mid x = 0, \ldots, k\}$. Let $\rho_{XZ} = \sum_{x=0}^{k} p_x |x\rangle\langle x| \otimes \rho_x$. Let Y be a classical
random variable which indicates the result of a measurement on the quantum state
with density matrix $\rho_Z = \sum_x p_x \rho_x$. Then

\[ H(X : Y) \le \chi(\mathcal{E}) = S(X : Z). \]

We will also need the following lemma.

Lemma 4.10. Let $\mathcal{E} = \{(p_x, \sigma_x) \mid x = 0, \ldots, k\}$ be an ensemble of density matrices
and let $\sigma = \sum_x p_x \sigma_x$ be the density matrix of the mixed state of the ensemble. Assume
there is an observable with possible measurement results x and ?, so that for all x
measuring the observable on $\sigma_x$ yields x with probability at least $1 - \epsilon$, the result
? with probability at most $\epsilon$, and a result $x' \neq x$ with probability 0. Then

\[ S(\sigma) \ge \sum_x p_x S(\sigma_x) + (1 - \epsilon) H(X), \quad\text{i.e.,}\quad \chi(\mathcal{E}) \ge (1 - \epsilon) H(X). \]

Proof. States x of a classical random variable X are coded as quantum states $\sigma_x$,
where x and $\sigma_x$ have probability $p_x$. The density matrix of the overall mixed state is
$\sigma$ and has von Neumann entropy $S(\sigma)$. $\sigma$ corresponds to the "code" of a random x.

According to Holevo's theorem (Fact 4.9) the information on X one can access
by measuring $\sigma$ with result Y is bounded by $H(X : Y) \le S(\sigma) - \sum_x p_x S(\sigma_x)$. But
there is such a measurement as assumed in the lemma, and with lemma 4.3
$H(X : Y) \ge (1 - \epsilon)H(X)$. Thus the lemma follows. □

Not all the relations that are valid in classical information theory hold in quantum 
information theory. The following fact states a notable exception, the so-called Araki- 
Lieb inequality and one of its consequences, see |3fi| . 

Fact 4.11. $S(XY) \ge |S(X) - S(Y)|$.

$S(X : Y|Z) \le 2S(X)$.

The reason for this behaviour is entanglement. 

Lemma 4.12. If $\sigma_{XY}$ is separable, then $S(XY) \ge S(X)$ and $S(X : Y) \le S(X)$.

4.4. The quantum communication model. Now we define quantum one-way 
protocols. 

Definition 4.13. In a two player quantum one-way protocol players Alice and 
Bob each possess a private set of qubits. Some of the qubits are initialized to the 
Boolean inputs of the players, all other qubits are in some fixed basis state |0). 

Alice then performs some quantum operation on her qubits and sends a set of 
these qubits to Bob. The latter action changes the possession of qubits rather than the 
global state. We can assume that Alice sends the same number of qubits for all inputs. 
After Bob has received the qubits he can perform any quantum operation on the qubits 
in his possession and afterwards he announces the result of the computation. The 
complexity of a protocol is the number of qubits sent. 

In an exact quantum protocol the result has to be correct with certainty. 
is the minimal complexity of an exact quantum protocol for a function f . 

In a bounded error protocol the output has to be correct with probability $1 - \epsilon$ (for
$1/2 > \epsilon > 0$). The bounded error quantum one-way communication complexity of a
function f is $Q_\epsilon(f)$ resp. $Q(f) = Q_{1/3}(f)$, the minimal complexity of a bounded error
quantum one-way protocol for f.

Quantum Las Vegas protocols are defined regarding acceptance as their probabilistic
counterparts, the notation is $Q_{0,\epsilon}(f)$.

[10] considers a different model of quantum communication: Before the start of
the protocol Alice and Bob own a set of qubits whose state may be entangled, but must 
be independent of the inputs. Then as above a quantum communication protocol is 
used. We use superscripts pub to denote the complexity in this model. 

It is possible to simulate the model with entangled qubits by allowing first an 
arbitrary finite communication independent of the inputs, followed by an ordinary 
protocol. 

By measuring distributed EPR-pairs it is possible to simulate classical public 
randomness. The technique of superdense coding of [5] allows in the model with prior
entanglement to send n bits of classical information with $\lceil n/2 \rceil$ qubits.

4.5. Quantum circuits and formulae. Besides quantum Turing machines
quantum circuits [12] are a universal model of quantum computation, see [42], and
are generally easier to handle in descriptions of quantum algorithms. A more general
model of quantum circuits, in which superoperator gates work on density matrices, is
described in [1]. We begin with the basic model.

Definition 4.14. A unitary quantum gate with k inputs and k outputs is specified
by a unitary operator $U : \mathbb{C}^{2^k} \to \mathbb{C}^{2^k}$.

A quantum circuit consists of unitary quantum gates with 0(1) inputs and outputs 
each, plus a set of inputs to the circuits, which are connected to an acyclic directed 
graph, in which the inputs are sources. Sources are labeled by Boolean constants or
by input variables. Edges correspond to qubits, the circuit uses as many qubits as it
has sources. One designated qubit is the output qubit. A quantum circuit computes a 
unitary transformation on the source qubits in the obvious way. In the end the output 
qubit is measured in the standard basis. 

The size of a quantum circuit is the number of its gates, the depth is the length 
of the longest path from an input to the output. 

A quantum circuit computes a function with bounded error, if it gives the right 
output with probability at least 2/3 for all inputs. 

A quantum circuit computes a Boolean function with Monte Carlo error, if it has 
bounded error and furthermore never erroneously accepts. 

A pair of quantum circuits computes a Boolean function f in the Las Vegas sense,
if the first is a Monte Carlo circuit for f, and the second is a Monte Carlo circuit for
$\neg f$.

A quantum circuit computes a function exactly, if it makes no error. 

The definition of Las Vegas circuits is motivated by the fact that we can easily
verify the computation of a pair of Monte Carlo circuits for f and $\neg f$ as in the classical
case, see the proof of corollary 3.13.

We are interested in restricted types of circuits, namely quantum formulae [42].

Definition 4.15. A quantum formula is a quantum circuit with the following 
additional property: for each source there is at most one path connecting it to the 
output. The length or size of a quantum formula is the number of its sources. 

Apart from the Boolean input variables a quantum formula is allowed to read
Boolean constants only. There is only one final measurement. We also call the model from
[42] pure quantum formulae.

In [1] a more general model of quantum circuits is studied, in which superoperators
work on density matrices.

Definition 4.16. A superoperator gate g of order $(k, l)$ is a trace-preserving,
completely positive map from the density matrices on k qubits to the density matrices
on l qubits.

A quantum superoperator circuit is a directed acyclic graph with inner vertices 
marked by superoperator gates with fitting fan- in and fan- out. The sources are marked 
with input variables or Boolean constants. One gate is designated as the output. 

A function is computed as follows. In the beginning the sources are each assigned 
a density matrix corresponding to the Boolean values determined by the input or by a 
constant. The Boolean value 0 corresponds to $|0\rangle\langle 0|$, 1 to $|1\rangle\langle 1|$. The overall state of
the qubits involved is the tensor product of these density matrices. 

Then the gates are applied in an arbitrary topological order. Applying a gate 
means applying the superoperator composed of the gates' superoperator on the chosen 
qubits for the gate and the identity superoperator on the remaining qubits. 

In the end the state of the output qubit is supposed to be a classical probability
distribution on $|0\rangle$ and $|1\rangle$.

The following fact from [1] allows gates to be applied in an arbitrary topological
ordering.

Fact 4.17. Let C be a quantum superoperator circuit, $C_1$ and $C_2$ be two sets of
gates working on different sets of qubits. Then for all density matrices $\rho$ on the qubits
in the circuit the result of $C_1$ applied to the result of $C_2$ on $\rho$ is the same as the result
of $C_2$ applied to the result of $C_1$ on $\rho$.

Let two arbitrary topological orderings of the gates in a quantum superoperator
circuit be given. The result of applying the gates in one ordering is the same as the
result of applying the gates in the other ordering for any input density matrix.

One more aspect is interesting in the definition of quantum formulae: we want to 
allow quantum formulae to access multiple read random inputs, just as fair probabilis- 
tic formulae. This makes it possible to simulate the latter model. Instead of random 
variables we allow the quantum formulae to read an arbitrary nonentangled state. A 
pure state on k qubits is called nonentangled, if it is the tensor product of k states on 
1 qubit each. A mixed state is nonentangled, if it can be expressed as a probabilistic 
ensemble of nonentangled pure states. Note that a classical random variable read k
times can be modelled as $|1^k\rangle$ with probability 1/2 and $|0^k\rangle$ with probability 1/2.

We restrict our definition to gates with fan-in 2, the set of quantum gates with 
fan-in 2 is known to be universal [3]. 

Definition 4.18. A generalized quantum formula is a quantum superoperator 
circuit with fan-out 1 /fan-in 2 gates together with a fixed nonentangled mixed state. 
The sources of the circuit are either labeled by input variables, or may access a qubit 
of the state. Each qubit of this state may be accessed only by one gate. 

As proved in [1] the Kraus representation theorem (Fact 4.7) implies that quan-
tum superoperator circuits with constant fan-in are asymptotically as efficient as 
quantum circuits with constant fan-in. The same holds for quantum formulae. The 
essential difference between pure and generalized quantum formulae is the availability 
of multiple read random bits. 

4.6. Programmable quantum gates. For simulations of quantum mechanical 
formulae by communication protocols we will need a programmable quantum gate. 
Such a gate allows Alice to communicate a unitary operation as a program stored in 
some qubits to Bob, who then applies this operation to some of his qubits. 

Formally we have to look for a unitary operator G with

\[ G(|d\rangle \otimes |P_U\rangle) = U(|d\rangle) \otimes |P'_U\rangle. \]

Here $|P_U\rangle$ is the "code" of a unitary operator U, and $|P'_U\rangle$ is some leftover of the
code.

The bad news is that such a programmable gate does not exist, as proved in [55].
Note that in the classical case such gates are easy to construct.

Fact 4.19. If N different unitary operators (pairwise different by more than
a global phase) can be implemented by a programmable quantum gate, then the gate
needs a program of length log N.

Since there are infinitely many unitary operators on just one qubit there is no
programmable gate with finite program length implementing them all. The proof
uses that the gate works deterministically, and actually a probabilistic solution to the
problem exists.

We now sketch a construction of Nielsen and Chuang. For the sake of simplicity
we just describe the construction for unitary operations on one qubit.
The program of a unitary operator U is

\[ |P_U\rangle = \frac{1}{\sqrt 2} \left( |0\rangle U|0\rangle + |1\rangle U|1\rangle \right). \]

The gate receives as input $|d\rangle \otimes |P_U\rangle$. The gate then measures the first and second
qubit in the basis $\{|\Phi^+\rangle, |\Phi^-\rangle, |\Psi^+\rangle, |\Psi^-\rangle\}$. Then the third qubit is used as a result.
For a state $|d\rangle = a|0\rangle + b|1\rangle$ the input to the gate is

\begin{align*}
|d\rangle \otimes |P_U\rangle = \frac{1}{2} \big[ & |\Phi^+\rangle (aU|0\rangle + bU|1\rangle) + |\Phi^-\rangle (aU|0\rangle - bU|1\rangle) \\
& + |\Psi^+\rangle (aU|1\rangle + bU|0\rangle) + |\Psi^-\rangle (aU|1\rangle - bU|0\rangle) \big].
\end{align*}

Thus the measurement produces the correct state with probability 1 /4 and more- 
over the result of the measurement indicates whether the computation was done cor- 
rectly. Also, given this measurement result we know exactly which unitary "error" 
operation has been applied before the desired operation. We now state Nielsen and 
Chuang's result. 

Fact 4.20. There is a probabilistic programmable quantum gate with m input
qubits for the state plus 2m input qubits for the program, which implements every
unitary operation on m qubits, and succeeds with probability $1/2^{2m}$. The result of a
measurement done by the gate indicates whether the computation was done correctly,
and which unitary error operation has been performed.
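The one-qubit construction sketched above can be verified by direct calculation on state vectors. The following is a minimal sketch, assuming a 2 x 2 unitary given as nested lists and the Bell basis from Section 4.2; the names `gate_input`, `project`, and `BELL` are illustrative. It computes the unnormalized state of the third qubit for each measurement outcome on the first two qubits.

```python
from math import sqrt

def apply(u, v):
    """Apply a 2x2 matrix u to a qubit state v = [v0, v1]."""
    return [u[0][0] * v[0] + u[0][1] * v[1], u[1][0] * v[0] + u[1][1] * v[1]]

def gate_input(d, u):
    """Amplitudes psi[q1][q2][q3] of |d> (x) |P_U>, where
    |P_U> = (|0> U|0> + |1> U|1>) / sqrt(2)."""
    return [[[d[q1] * u[q3][q2] / sqrt(2) for q3 in range(2)]
             for q2 in range(2)] for q1 in range(2)]

# Real Bell amplitudes bell[q1][q2], without the common factor 1/sqrt(2).
BELL = {
    "Phi+": [[1, 0], [0, 1]],
    "Phi-": [[1, 0], [0, -1]],
    "Psi+": [[0, 1], [1, 0]],
    "Psi-": [[0, 1], [-1, 0]],
}

def project(psi, name):
    """Unnormalized state of qubit 3 after outcome `name` on qubits 1 and 2."""
    bell = BELL[name]
    return [sum(bell[q1][q2] * psi[q1][q2][q3]
                for q1 in range(2) for q2 in range(2)) / sqrt(2)
            for q3 in range(2)]

def probability(v):
    """Probability of the outcome that leaves the unnormalized state v."""
    return sum(abs(x) ** 2 for x in v)
```

Each of the four outcomes occurs with probability 1/4, and for the outcome "Phi+" the remaining state is U|d> up to normalization, in agreement with the expansion above.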

5. One-way communication complexity: the nondeterministic and the 
quantum case. 

5.1. A lower bound for limited nondeterminism. In this section we inves- 
tigate nondeterministic one-way communication with a limited number of nondeter- 
ministic bits. Analogous problems for many round communication complexity have 
been addressed in [19], but in this section we again consider asymmetric problems,
for which the one-way restriction is essential. 

It is easy to see that if player Bob has m input bits then m nondeterministic 
bits are the maximum player Alice needs. Since the nondeterministic communication 
complexity without any limitation on the number of available nondeterministic bits 
is at most m, Alice can just guess the communication and send it to Bob in case it 
is correct with respect to her input and leads to acceptance. Bob can then check the 
same for his input. Thus an optimal protocol can be simulated. 

For the application to lower bounds on formula size we are again interested in 
functions with an asymmetric input partition, i.e., Alice receives many more inputs
than Bob. For nontrivial results thus the number of nondeterministic bits must be 
smaller than the number of Bob's inputs. 

A second observation is that using s nondeterministic bits can reduce the com- 
munication complexity from the deterministic one-way communication complexity d 
to $d/2^s$ in the best case. If s is sublogarithmic, strong lower bounds follow already
from the deterministic lower bounds, e.g. $N_{\epsilon \log n}(\neg EQ) \ge n^{1-\epsilon}$, while
$N_{\log n}(\neg EQ) = O(\log n)$. On the other hand:

Lemma 5.1.

\[ N_s(f) = c \Rightarrow N_c(f) \le c. \]

Proof. In a protocol with communication c at most $2^c$ different messages can be
sent (for all inputs). To guess such a message c nondeterministic bits are sufficient.
□ 

It is not sensible to guess more than to communicate. We are interested in de- 
termining how large the difference between nondeterministic one-way communication 
complexity with s nondeterministic bits and unrestricted nondeterministic communi- 
cation complexity may be. Therefore we consider the maximal such gap as a function

Corollary 5.2. Let $f : \{0,1\}^n \times \{0,1\}^m \to \{0,1\}$ be a Boolean function and
$G : \mathbb{N} \to \mathbb{N}$ a monotone increasing function with
$N(f) = c$ and $N_s(f) = G(c)$ for some s.

Then $N_{G^{-1}(n)}(f) \le c$ and hence $s \le G^{-1}(n)$, where $G^{-1}(x) = \min\{y \mid G(y) \ge x\}$.
Proof. $G(c) \le n$ and hence $c \le G^{-1}(n)$. □

The range of values of s, for which a gap G between $N(f)$ and $N_s(f)$ is possible
is thus limited. If e.g. an exponential difference $G(x) = 2^x$ holds, then $s \le \log n$. If
$G(x) = r \cdot x$, then $s \le n/r$.

We now show a gap between nondeterministic one-way communication complexity 
with s nondeterministic bits and unlimited nondeterministic communication complex- 
ity. First we define the family of functions exhibiting this gap. 

Definition 5.3. Let $D_{n,s}$ be the following Boolean function for $1 \le s \le n$:

\[ D_{n,s}(x_1, \ldots, x_{n+1}) = 1 \iff \exists i: |\{ j \mid j \neq i;\ x_i \cap x_j \neq \emptyset \}| \ge s. \]

Note that the function has $\Theta(sn \log n)$ input bits in a standard encoding. We
consider the partition of inputs in which Bob receives the set $x_{n+1}$ and Alice all other
sets. The upper bounds in the following lemma are trivial, since Bob only receives
$O(s \log n)$ input bits.

Lemma 5.4.

\[ N_{O(s \log n)}(D_{n,s}) = O(s \log n). \]

\[ D^{B}(D_{n,s}) = O(s \log n). \]

The lower bound we present now results in a near optimal difference between
nondeterministic (one-way) communication and limited nondeterministic one-way
communication. Limited nondeterministic one-way communication has also been studied
subsequently to this work in [18]. There a tradeoff between the consumption of
nondeterministic bits and the one-way communication is demonstrated (i.e., with more
nondeterminism the communication gradually decreases). Here we describe a
fundamentally different phenomenon of a threshold type: nondeterministic bits do not help
much, until a certain amount of them is available, when quite quickly the optimal
complexity is attained. For more results of this type see [23].

Theorem 5.5. There is a constant $\epsilon > 0$, so that for $s \le n$

\[ N_{\epsilon s}(D_{n,s}) = \Omega(ns \log n). \]

Proof. We have to show that all nondeterministic one-way protocols computing
$D_{n,s}$ with $\epsilon s$ nondeterministic bits need much communication.

A nondeterministic one-way protocol with $\epsilon s$ nondeterministic bits and
communication c induces a cover of the communication matrix with $2^{\epsilon s}$ Boolean matrices
having the following properties: each 1-entry of the communication matrix is a
1-entry in at least one of the Boolean matrices, no 0-entry of the communication matrix
is a 1-entry in any of the Boolean matrices, furthermore the set of rows appearing
in those matrices has size at most $2^c$. This set of matrices is obtained by fixing
the nondeterministic bits and taking the communication matrices of the resulting
deterministic protocols. We show the lower bound from the property that each of
the Boolean matrices covering the communication matrices uses at most $2^c$ different
rows. Thus the lower bound actually even holds for protocols with limited, but public
nondeterminism.

We first construct a submatrix of the communication matrix with some useful 
properties, and then show the theorem for this "easier" problem. 

Partition the universe $\{1, \ldots, n^3\}$ in n disjoint sets $U_1, \ldots, U_n$ with $|U_i| = n^2 = m$.
Then choose vectors of n size s subsets of the universe, so that the ith subset is from
$U_i$. Thus the n subsets of a vector are pairwise disjoint. Now the protocol has to
determine whether the set of Bob intersects nontrivially with s sets of Alice.
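The restricted problem can be stated as a short predicate. This is a minimal sketch, assuming sets are given as Python sets of integers from the universe $\{1, \ldots, n^3\}$; the function names are illustrative. Since Bob's set has only s elements and Alice's sets are pairwise disjoint, intersecting at least s of them is the same as intersecting exactly s.

```python
def partition_universe(n):
    """Blocks U_1, ..., U_n of {1, ..., n^3}, each of size m = n^2."""
    m = n * n
    return [range(i * m + 1, (i + 1) * m + 1) for i in range(n)]

def bob_hits_s_sets(alice_sets, bob_set, s):
    """Return 1 iff Bob's set intersects at least s of Alice's sets."""
    hits = sum(1 for a in alice_sets if set(a) & set(bob_set))
    return int(hits >= s)
```

For instance, with n = 3 and s = 2, Alice's vector `[{1, 2}, {10, 11}, {19, 20}]` and Bob's set `{1, 10}` give output 1, while Bob's set `{1, 3}` gives output 0.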

We restrict the set of inputs further. There are $\binom{m}{s}$ subsets of $U_i$ having size s. We
choose a set of such subsets so that each pair of them has no more than s/2 common
elements. To do so we start with any subset and remove all subsets in "distance" at
most s/2. This continues as long as possible. We get a set of subsets of $U_i$, whose
elements have pairwise distance at least s/2. In every step at most $\binom{s}{s/2}\binom{m}{s/2}$ subsets
are removed, thus we get at least

\[ \binom{m}{s} \Big/ \left( \binom{s}{s/2} \binom{m}{s/2} \right) \]

sets.
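The greedy selection can be tried out on small parameters. The sketch below is an illustration only: it enumerates all size-s subsets, which is feasible just for tiny m and s, and the function name is illustrative. A candidate is kept if it shares at most s/2 elements with every subset kept so far.

```python
from itertools import combinations

def greedy_family(m, s):
    """Greedily pick size-s subsets of {0, ..., m-1} that pairwise share
    at most s/2 elements."""
    chosen = []
    for cand in combinations(range(m), s):
        c = set(cand)
        if all(len(c & other) <= s / 2 for other in chosen):
            chosen.append(c)
    return chosen
```

Every rejected candidate shares more than s/2 elements with some kept subset, so the size of the resulting family is at least the number of all size-s subsets divided by the maximal number of subsets excluded by a single kept one.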

As described we draw Alice's inputs as vectors of sets, where the set at position 
i is drawn from the set of subsets of Ui we have just constructed. These inputs are 
identified with the rows of the submatrix of the communication matrix. The columns 
of the submatrix are restricted to elements of $(U_1 \cup \{\top\}) \times \cdots \times (U_n \cup \{\top\})$, for which
s positions are occupied, i.e., $n - s$ positions carry the extra symbol $\top$ which stands
for "no element". Call the constructed submatrix M.

Now assume there is a protocol computing the restricted problem. Fixing the 
nondeterministic bits induces a deterministic protocol and a matrix M', which covers 
at least $1/2^r$ of the ones of M, where $r = \epsilon s$. We now show that such a matrix must
have many different rows, which corresponds to large communication. 

Each row of M corresponds to a vector of n sets. A position i is called a difference 
position for a pair of such vectors, if they have different sets at position i. According 
to our construction these sets have no more than s/2 elements in common. 

We say a set of rows has k difference positions, if there are k positions $i_1, \ldots, i_k$,
so that for each $i_l$ there are two rows in the set for which $i_l$ is a difference position.

We now show that each row of M' containing "many" ones does not "fit" on many 
rows of M, i.e., contains ones these do not have. Since M' has one-sided error only, 
the rows of M' are either sparse or cover only few rows of M. Observe that each row 
of M has exactly $\binom{n}{s} s^s$ ones.

Lemma 5.6. Let z be a row of M', appearing several times in M'. The rows of
M, in whose place in M the row z appears in M', may have $\delta n$ difference positions.
Then z contains at most $2\binom{n}{s} s^s / 2^{\delta s/6}$ ones.

Proof. Several rows of M having $\delta n$ difference positions are given, and the ones
of z occur in all of these rows. Let C be the set of columns/sets being the ones
in the first such row. All other columns are forbidden and may not be ones in z.
A column in C is chosen randomly by choosing s out of n positions and then one
of s elements for each position. Let $k = \delta s$. We have to show an upper bound on
the number of ones in z, and we analyze this number as the probability of getting a
one when choosing a column in C. The probability of getting a one is at most the
probability that the chosen positions have a nontrivial intersection with less than k/2
sets $U_i$ at difference positions i (event E) plus the probability of getting a one under the
condition of event $\neg E$, following the general formula $\Pr(A) \le \Pr(A|\neg E) + \Pr(E)$.

We first count the columns in $C$ which have a nontrivial intersection with at most
$k/2$ of the sets $U_i$ at difference positions $i$. Consider the slightly different experiment
in which $s$ times independently one of $n$ positions is chosen, hence positions may be
chosen more than once. Now in expectation $\delta s = k$ difference positions are chosen.
Applying Chernoff's inequality yields that with probability at most

$$2^{-\delta s/6}$$

at most $k/2$ difference positions occur. When choosing a random column in $C$ instead,
this probability is even smaller, since now positions are chosen without repetitions.
Thus the columns in $C$ which "hit" less than $k/2$ difference positions contribute at
most $2^{-\delta s/6} \binom{n}{s} s^s$ ones to $z$.
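The tail estimate above can be checked numerically on a small instance. The following is only a sketch under assumed parameters (`n`, `s`, and `delta` are hypothetical values, not taken from the paper). It computes the exact probability of hitting at most $k/2$ difference positions when positions are drawn with repetition (binomial) and without repetition (hypergeometric), and compares both with the standard multiplicative Chernoff bound $e^{-k/8}$ for a deviation of one half of the mean, which in turn is below $2^{-k/6}$.

```python
from math import comb, exp

def binomial_lower_tail(trials, p, threshold):
    """P[X <= threshold] for X ~ Binomial(trials, p)."""
    return sum(comb(trials, j) * p**j * (1 - p)**(trials - j)
               for j in range(threshold + 1))

def hypergeometric_lower_tail(population, marked, draws, threshold):
    """P[X <= threshold] when `draws` positions are taken without repetition
    from `population` positions, `marked` of which are difference positions."""
    total = comb(population, draws)
    hits = sum(comb(marked, j) * comb(population - marked, draws - j)
               for j in range(threshold + 1))
    return hits / total

# Hypothetical parameters: n positions, delta*n difference positions, s draws.
n, s, delta = 100, 40, 0.5
k = delta * s                      # expected number of difference positions hit
threshold = int(k // 2)            # the event "at most k/2 difference positions"

with_repetition = binomial_lower_tail(s, delta, threshold)
without_repetition = hypergeometric_lower_tail(n, int(delta * n), s, threshold)
chernoff = exp(-k / 8)             # standard Chernoff bound for deviation 1/2
target = 2 ** (-k / 6)             # the bound 2^(-delta*s/6) used in the proof
```

For these particular values the exact tails lie far below the bound, which illustrates how much slack the estimate leaves; it does not replace the general argument.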

Now consider the columns/sets in $C$ which intersect at least $k/2$ of the $U_i$ at
difference positions $i$. Such a column/set fits on all the rows if the element at each
position $i$ not bearing a $\top$ lies in the intersection of all sets in the rows at position $i$. At
each difference position there are two rows which hold different sets at that position,
and those sets have at most $s/2$ elements in common.

Fix an arbitrary set of positions such that at least $k/2$ difference positions are
included. The next step of choosing a column in $C$ consists of choosing one of $s$
elements for each position. But if a position is a difference position, then at most
$s/2$ elements satisfy the condition of lying in the sets held by all the rows at that
position. Thus the probability of fitting on all the rows is at most $2^{-k/2}$, and at most
$\binom{n}{s} s^s / 2^{k/2}$ such columns can be a one in $z$.

Overall only a fraction of $2^{-\delta s/6+1}$ of all columns in $C$ can be ones in $z$. □

At least one half of all ones in $M'$ lie in rows containing at least $\binom{n}{s} s^s / 2^{r+1}$
ones. Lemma 5.6 tells us that such a row fits only on a set of rows of $M$ having no
more than $\delta n$ difference positions, where $r + 1 = \delta s/6 - 1$. Hence such a row can
cover at most all the ones in $\binom{m}{s}^{\delta n}$ rows of $M$, and therefore only $\binom{m}{s}^{\delta n} \binom{n}{s} s^s$ ones.

According to (5.1) at least $(m/s)^{sn/2} \binom{n}{s} s^s / (2^{3sn/2}\, 2^{r+1})$ ones are covered by such
rows, hence



rows are necessary (for $\epsilon = 1/20$ and $n > s > 400$). □

5.2. Quantum one-way communication. Our first goal in this section is to
prove that the VC-dimension lower bound for randomized one-way protocols (Fact
2.9) can be extended to the quantum case. To achieve this we first prove a linear
lower bound on the bounded error quantum communication complexity of the index
function $IX_n$, and then describe a reduction from the index function $IX_d$ to any
function with VC-dimension $d$, thus transferring the lower bound. It is easy to see
that $VC(IX_n) = n$, and thus the bounded error probabilistic one-way communication
complexity is large for that function.

The problem of random access quantum coding has been considered in [2] and
[32]. In an $(n, m, \epsilon)$-random access quantum code all Boolean $n$-bit words $x$ have to be
mapped to states of $m$ qubits each, so that for $i = 1, \ldots, n$ there is an observable
such that measuring the quantum code with that observable yields the bit $x_i$ with
probability $1 - \epsilon$. The quantum code is allowed to be a mixed state. Nayak has
shown the following.

Fact 5.7. For every $(n, m, \epsilon)$-random access quantum coding, $m \ge (1 - H(\epsilon))n$.

It is easy to see that the problem of random access quantum coding is equivalent 
to the construction of a quantum one-way protocol for the index function. If there is 
such a protocol, then the messages can serve as mixed state codes, and if there is such 
a code the codewords can be used as messages. We can thus deduce a lower bound 
for IX n in the model of one-way quantum communication complexity without prior 
entanglement. 

We now give a proof that can also be adapted to the case where prior entanglement
is allowed.

Theorem 5.8. $Q_\epsilon(IX_n) \ge (1 - H(\epsilon))n$.
$Q_\epsilon^{pub}(IX_n) \ge (1 - H(\epsilon))n/2$.

Proof. Let $M$ be the register containing the message sent by Alice, and let $X$ be
a register holding a uniformly random input to Alice. Then $\sigma_{XM}$ denotes the state of
Alice's qubits directly before the message is sent, and $\sigma_M$ is the state of a random message.
Now every bit is decodable with probability $1 - \epsilon$ and thus $S(X_i : M) \ge 1 - H(\epsilon)$
for all $i$. To see this consider $S(X_i : M)$ as the Holevo information of the following
ensemble:

$$\frac{1}{2^{n-1}} \sum_{x : x_i = 0} \sigma_M^x$$

with probability 1/2 and

$$\frac{1}{2^{n-1}} \sum_{x : x_i = 1} \sigma_M^x$$

with probability 1/2, where $\sigma_M^x$ is the density matrix of the message on input $x$.
The information obtainable on $X_i$ by measuring $\sigma_M$ must be at least $1 - H(\epsilon)$ due to
Fano's inequality (Fact 4.2), and thus the Holevo information of the ensemble is at
least $1 - H(\epsilon)$, hence $S(X_i : M) \ge 1 - H(\epsilon)$.

But then $S(X : M) \ge (1 - H(\epsilon))n$ (since all $X_i$ are mutually independent).
Moreover $S(X : M) \le S(M)$ using lemma 4.12, since $X$ and $M$ are not entangled. Thus the
number of qubits in $M$ is at least $(1 - H(\epsilon))n$.

Now we analyze the complexity of IX n in the one-way communication model with 
entanglement. 

The density matrix of the state induced by a uniformly random input on $X$,
the message $M$, and the qubits $E_A, E_B$ containing the prior entanglement in the
possession of Alice and Bob, is $\sigma_{XME_AE_B}$. Here $E_A$ contains those qubits of the
entangled state Alice keeps; note that some of the entangled qubits will usually belong
to $M$. Tracing out $X$ and $E_A$ we receive a state $\sigma_{ME_B}$, which is accessible to Bob.
Now every bit of the string in $X$ is decodable, thus $S(X_i : ME_B) \ge 1 - H(\epsilon)$ for all
$i$ as before. But then also $S(X : ME_B) \ge (1 - H(\epsilon))n$, since all the $X_i$ are mutually
independent.

$S(X : ME_B) = S(X : E_B) + S(X : M \mid E_B) \le 2S(M)$ by an application of the
Araki-Lieb inequality, see Fact 4.11. Note that $S(X : E_B) = 0$. So the number of
qubits in $M$ must be at least $(1 - H(\epsilon))n/2$. □
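To get a feeling for the two bounds of Theorem 5.8, the following small sketch evaluates them for concrete values. The function names are hypothetical and chosen only for this illustration; the formulas are the ones stated in the theorem, with $H$ the binary entropy function.

```python
from math import log2

def binary_entropy(eps):
    """Binary entropy H(eps) in bits, with H(0) = H(1) = 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * log2(eps) - (1 - eps) * log2(1 - eps)

def index_lower_bound(n, eps, prior_entanglement=False):
    """Number of qubits required for IX_n with error eps according to
    Theorem 5.8: (1 - H(eps)) n, halved if prior entanglement is allowed."""
    bound = (1 - binary_entropy(eps)) * n
    return bound / 2 if prior_entanglement else bound
```

For example, `index_lower_bound(100, 1/3)` is a little more than 8 qubits, while error probability 1/2 makes the bound vanish, as it should, since a random guess needs no communication.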

Note that the lower bound shows that 2-round deterministic communication complexity
can be exponentially smaller than one-way quantum communication complexity.
For a more general quantum communication round hierarchy see [26].

Theorem 5.9. For all functions $f$: $Q_\epsilon(f) \ge (1 - H(\epsilon)) VC(f)$ and

$Q_\epsilon^{pub}(f) \ge (1 - H(\epsilon)) VC(f)/2$.

Proof. We now describe a reduction from the index function to $f$. Assume
$VC(f) = d$, i.e., there is a set $S = \{s_1, \ldots, s_d\}$ of inputs for Bob which is shattered
by the set of functions $f(x, \cdot)$. The reduction then goes from $IX_d$ to $f$.

For each $R \subseteq S$ let $c_R$ be the incidence vector of $R$ (having length $d$). $c_R$ is a
possible input for Alice when computing the index function $IX_d$. For each $R$ choose
some $x_R$ which separates this subset from the rest of $S$, i.e., so that $f(x_R, y) = 1$ for
all $y \in R$ and $f(x_R, y) = 0$ for all $y \in S - R$.

Assume a protocol for $f$ is given. To compute the index function the players do
the following. Alice maps $c_R$ to $x_R$. Bob's inputs $i$ are mapped to the $s_i$. Then
$f(x_R, s_i) = 1 \iff s_i \in R \iff c_R(i) = 1$.

In this manner a quantum protocol for $f$ must implicitly compute $IX_d$. According
to Theorem 5.8 the lower bounds follow. □
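The reduction in this proof can be made concrete by brute force on a tiny function. The sketch below is an illustration only: the helper names `find_shattered_set` and `index_via_f` are hypothetical, indices are 0-based, and disjointness on 3-element universes serves as the example function (an assumption made for the demonstration; any function with a shattered set of size $d$ would do).

```python
from itertools import combinations, product

def find_shattered_set(f, xs, ys, d):
    """Search for d inputs of Bob that are shattered by the functions f(x, .).
    Returns (S, chooser), where chooser maps each 0/1 pattern on S to some
    input x_R of Alice realizing it, or None if no such set exists."""
    for S in combinations(ys, d):
        chooser = {}
        for x in xs:
            chooser.setdefault(tuple(f(x, y) for y in S), x)
        if len(chooser) == 2 ** d:      # all 2^d patterns occur: S is shattered
            return S, chooser
    return None

def index_via_f(f, S, chooser, c, i):
    """Compute IX_d(c, i) = c[i] with a single evaluation of f:
    Alice maps the incidence vector c to x_R, Bob maps i to s_i."""
    x_R = chooser[tuple(c)]
    return f(x_R, S[i])

def disj(x, y):
    """Disjointness on characteristic vectors: 1 iff x and y share no element."""
    return int(not any(a and b for a, b in zip(x, y)))

inputs = list(product((0, 1), repeat=3))
S, chooser = find_shattered_set(disj, inputs, inputs, 3)
```

Any one-way protocol for `disj` therefore also yields a protocol for the index function on 3 bits, which is the content of the reduction.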

Application of the previous theorem gives us lower bounds for the disjointness
problem in the model of quantum one-way communication complexity. Lower bounds
of the order $\Omega(n^{1/k})$ for constant $k$ in $k$-round protocols are given in [26].

Corollary 5.10. $Q_\epsilon(DISJ_n) \ge (1 - H(\epsilon))n$.

$Q_\epsilon^{pub}(DISJ_n) \ge (1 - H(\epsilon))n/2$.

The first result has independently been obtained in UJ. Note that the obtained 
lower bound method is not tight in general. There are functions for which an unbounded
gap exists between the VC-dimension and the quantum one-way communication
complexity [25].

Now we turn to the exact and Las Vegas quantum one-way communication complexity.
For classical one-way protocols it is known that Las Vegas communication
complexity is at most a factor 1/2 better than deterministic communication for total
functions, see Fact 2.10.

Theorem 5.11. For all total functions $f$:

$Q_E(f) = D(f)$,

$Q_{0,\epsilon}(f) \ge (1 - \epsilon) D(f)$.

Proof. Let $row(f)$ be the number of different rows in the communication matrix
of $f(x,y)$. According to Fact 2.7, $D(f) = \lceil \log row(f) \rceil$. We assume in the following
that the communication matrix consists of pairwise different rows only.

We will show that any Las Vegas one-way protocol which gives up with probability
at most $\epsilon \ge 0$ for some function $f$ having $row(f) = R$ must use messages with von
Neumann entropy at least $(1 - \epsilon) \log R$ when started on a uniformly random input.
Inputs for Alice are identified with rows of the communication matrix. We then
conclude that the Hilbert space of the messages must have dimension at least $R^{1-\epsilon}$
and hence at least $(1 - \epsilon) \log R$ qubits have to be sent. This gives us the second lower
bound of the theorem. The upper bound of the first statement is trivial, the lower
bound of the first statement follows by taking $\epsilon = 0$.

We now describe a process in which rows of the communication matrix are chosen
randomly bit by bit. Let $p$ be the probability of having a 0 in column 1 (i.e., the
number of 0s in column 1 divided by the number of rows). Then a 0 is chosen with
probability $p$, a 1 with probability $1 - p$. Afterwards the set of rows is partitioned into
the set $I_0$ of rows starting with a 0, and the set $I_1$ of rows starting with a 1. When
$x_1 = b$ is chosen, the process continues with $I_b$ and the next column.

Let $\rho_y$ be the density matrix of the following mixed state: the (possibly mixed)
message corresponding to a row starting with $y$ is chosen uniformly over all such rows.

The probability that a 0 is chosen after $y$ is called $p_y$, and the number of different
rows beginning with $y$ is called $row_y$.

We want to show via induction that $S(\rho_y) \ge (1 - \epsilon) \log row_y$. Surely $S(\rho_y) \ge 0$
for all $y$.

Recall that Bob can determine the function value for an arbitrary column with 
the correctness guarantee of the protocol. 

Then with lemma 4.10 $S(\rho_y) \ge p_y S(\rho_{y0}) + (1 - p_y) S(\rho_{y1}) + (1 - \epsilon) H(p_y)$, and via
induction

$$\begin{aligned}
S(\rho_y) &\ge p_y (1 - \epsilon) \log row_{y0} + (1 - p_y)(1 - \epsilon) \log row_{y1} + (1 - \epsilon) H(p_y) \\
&= (1 - \epsilon) \left[ p_y \log(p_y\, row_y) + (1 - p_y) \log((1 - p_y)\, row_y) + H(p_y) \right] \\
&= (1 - \epsilon) \log row_y .
\end{aligned}$$

We conclude that $S(\rho) \ge (1 - \epsilon) \log row(f)$ for the density matrix $\rho$ of a message to a
uniformly random row. Hence the lower bound on the number of qubits holds. □
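The induction step rests on the identity $p_y \log(p_y\, row_y) + (1 - p_y) \log((1 - p_y)\, row_y) + H(p_y) = \log row_y$. The following sketch (with hypothetical function names, and for the case $\epsilon = 0$) evaluates the recursion column by column on a small matrix with pairwise different rows and confirms that it sums to $\log row(f)$; a factor $1 - \epsilon$ would simply multiply the result.

```python
from math import log2

def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def accumulated_entropy(rows, column=0):
    """Evaluate p_y * F(y0) + (1 - p_y) * F(y1) + H(p_y) recursively, where
    `rows` are the pairwise different rows that share the current prefix y."""
    if len(rows) <= 1:
        return 0.0
    zeros = [r for r in rows if r[column] == 0]
    ones = [r for r in rows if r[column] == 1]
    p = len(zeros) / len(rows)     # probability p_y of choosing a 0 after y
    return (p * accumulated_entropy(zeros, column + 1)
            + (1 - p) * accumulated_entropy(ones, column + 1)
            + binary_entropy(p))

# A small example matrix with 5 pairwise different rows.
rows = [(0, 0, 1), (0, 1, 0), (1, 1, 1), (1, 0, 0), (0, 1, 1)]
total = accumulated_entropy(rows)
```

This is just the chain rule for the entropy of a uniformly random row, revealed bit by bit.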

We now again consider the model with prior entanglement. 

Theorem 5.12. For all total functions $f$:

$Q_E^{pub}(f) = \lceil D(f)/2 \rceil$,

$Q_{0,\epsilon}^{pub}(f) \ge D(f)(1 - \epsilon)/2$.

The upper bound follows from superdense coding [5]. Instead of the lower bounds
of the theorem we prove a stronger statement. We consider an extended model of
quantum one-way communication that will be useful later.

In a nonstandard one-way quantum protocol Alice and Bob are allowed to com- 
municate in arbitrarily many rounds, i.e., they can exchange many messages. But 
Bob is not allowed to send Alice a message, so that the von Neumann information 
between the input of Alice plus the accessible qubits of Alice and Bob's input is larger 
than 0. The communication complexity of a protocol is the number of qubits sent by 
Alice in the worst case. The model is at least as powerful as the model with prior 
entanglement, since Bob may e.g. generate some EPR-pairs, send one qubit of each 
pair to Alice, then Alice may send a message as in a protocol with prior entanglement. 

Lemma 5.13. For all functions $f$ a nonstandard quantum one-way protocol with
bounded error must communicate at least $(1 - H(\epsilon)) VC(f)/2$ qubits from Alice to Bob.

For all total functions $f$ a nonstandard quantum one-way protocol

1. with exact acceptance must communicate at least $\lceil D(f)/2 \rceil$ qubits from Alice
to Bob.

2. with Las Vegas acceptance and success probability $1 - \epsilon$ must communicate at
least $(1 - \epsilon) D(f)/2$ qubits from Alice to Bob.
Proof. In this proof we always call the qubits available to Alice $P$, and the qubits
available to Bob $Q$, for simplicity disregarding that these registers change during the
course of the protocol. We assume that the inputs are in registers $X, Y$ and are never
erased or changed in the protocol. Furthermore we assume that for all fixed values
$x, y$ of the inputs the remaining global state is pure.

For the first statement it is again sufficient to investigate the complexity of the 
index function. 

Let $\sigma_{XYPQ}$ be the state for random inputs in $X, Y$ for Alice and Bob, with qubits
$P$ and $Q$ in the possession of Alice and Bob. Since Bob determines the result, it must
be true that at the end of the protocol $S(X_Y : YQ) \ge 1 - H(\epsilon)$, since the value $X_Y$
can be determined from Bob's qubits with probability $1 - \epsilon$. It is always true in the
protocol that $S(XP : Y) = 0$. Let $\rho_P^{X=x,Y=y}$ be the density matrix of $P$ for fixed
inputs $X = x$ and $Y = y$. Then we have that for all $x, y, y'$: $\rho_P^{X=x,Y=y} = \rho_P^{X=x,Y=y'}$.

$\rho_{PQ}^{X=x,Y=y}$ purifies $\rho_P^{X=x,Y=y}$. Then the following fact from [3U| and |2H] tells us
that all $y$ and corresponding states of $Q$ are "equivalent" from the perspective of
Alice.

Fact 5.14. Assume $|\phi_1\rangle$ and $|\phi_2\rangle$ are pure states in a Hilbert space $H \otimes K$, so
that $\mathrm{Tr}_K |\phi_1\rangle\langle\phi_1| = \mathrm{Tr}_K |\phi_2\rangle\langle\phi_2|$.

Then there is a unitary transformation $U$ acting on $K$, so that $(I \otimes U)|\phi_1\rangle = |\phi_2\rangle$
(for the identity operator $I$ on $H$).

Thus there is a local unitary transformation applicable by Bob alone, so that
$\rho_{PQ}^{X=x,Y=y}$ can be changed to $\rho_{PQ}^{X=x,Y=y'}$. Hence for all $i$ we have $S(X_i : QY) \ge 1 - H(\epsilon)$,
and thus $S(X : QY) \ge (1 - H(\epsilon))n$.

In the beginning $S(X : QY) = 0$. Then the protocol proceeds w.l.o.g. so that
each player applies a unitary transformation on his qubits and then sends a qubit
to the other player. Since the information cannot increase by local operations, it
is sufficient to analyze what happens if qubits are sent. When Bob sends a qubit
to Alice, $S(X : QY)$ is not increased. When Alice sends a qubit to Bob, then $Q$ is
augmented by a qubit $M$, and $S(X : QMY) \le S(X : QY) + S(XQY : M) \le S(X : QY) + 2S(M) \le S(X : QY) + 2$
due to Fact 4.11. Thus the information can increase
only when Alice sends a qubit and always by at most 2. The lower bound follows.

Now we turn to the second part. We consider the same situation as in the proof of
Theorem 5.11. Let $\sigma_P^{rc}$ denote the density matrix of the qubits $P$ in Alice's possession
under the condition that the input row is $r$ and the input column is $c$. Clearly $\sigma_{PQ}^{rc}$
(containing also Bob's qubits) is a purification of $\sigma_P^{rc}$. Again $\sigma_P^{rc} = \sigma_P^{rc'}$ for all $r, c, c'$,
and according to Fact 5.14 for all $c$ and all corresponding states of $Q$, it is true that
Bob can switch locally between them. Hence it is possible for Bob to compute the
function for an arbitrary column.

The probability of choosing a 0 after a prefix $y$ of a row is again called $p_y$,
and the number of different rows beginning with $y$ is called $row_y$. $\rho_y$ contains the
state of Bob's qubits at the end of the protocol if a random row starting with $y$
is chosen uniformly (and some fixed column $c$ is chosen). Surely $S(\rho_y) \ge 0$ for all
$y$. Since Bob can change his column (and the corresponding state of $Q$) by a local
unitary transformation, he is able to compute the function for an arbitrary column,
always with the success probability of the protocol, at the end. With lemma 4.10
$S(\rho_y) \ge p_y S(\rho_{y0}) + (1 - p_y) S(\rho_{y1}) + (1 - \epsilon) H(p_y)$.

At the end of the protocol thus $S(\sigma_Q^{c}) = S(\rho) \ge (1 - \epsilon) \log row(f) + \sum_r \frac{1}{row(f)} S(\sigma_Q^{rc})$
for all $c$. Thus the Holevo information of the ensemble in which $\rho_r = \sigma_Q^{rc}$ is chosen
with probability $1/row(f)$ is at least $(1 - \epsilon) \log row(f)$. Let $\sigma_{RPQ}$ be the density
matrix of rows, qubits of Alice and Bob. It follows that $S(R : Q) \ge (1 - \epsilon) \log row(f)$
and as before at least half that many qubits have to be sent from Alice to Bob. □

6. More lower bounds on formula size. 

6.1. Nondeterminism and formula size. Let us first mention that any nondeterministic
circuit can easily be transformed into a nondeterministic formula without
increasing size by more than a constant factor. To do so one simply guesses the values
of all gates and then verifies that all guesses are correct and that the circuit accepts.
This is a big AND over tests involving $O(1)$ variables each, and every such test can be
implemented by a CNF. Hence lower bounds for nondeterministic formulae are very hard to prove,
since even nonlinear lower bounds for the size of deterministic circuits computing
some explicit functions are unknown. We now show that formulae with limited
nondeterminism are more accessible. We start by introducing a variant of the Neciporuk
method, this time with nondeterministic communication:

Definition 6.1. Let $f$ be a Boolean function with $n$ input variables and $y_1, \ldots, y_k$
be a partition of the inputs in $k$ blocks.

Player Bob receives the inputs in $y_i$ and player Alice receives all other inputs.
The nondeterministic one-way communication complexity of $f$ with $s$ nondeterministic
bits under this input partition is called $N_s(f_i)$. Define the $s$-nondeterministic
Neciporuk function as $\frac{1}{4} \sum_{i=1}^{k} N_s(f_i)$.

Lemma 6.2. The s -nondeterministic Neciporuk function is a lower bound for the 
length of nondeterministic Boolean formulae with s nondeterministic bits. 

The proof is analogous to the proof of Theorem 3.5. Again protocols simulate the
formula in $k$ communication games. This time Alice fixes the nondeterministic bits
by herself, and no probability distribution on formulae is present.

We will apply the above methodology to the following language. 

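Definition 6.3. Let $AD_{n,s}$ denote the following language (for $1 \le s \le n$):

$$AD_{n,s} = \{(x_1, \ldots, x_{n+1}) \mid \forall i : x_i \in \mathcal{P}(n^3, s),\ x_i \text{ is written in sorted order} \ \wedge\ \exists i : |\{j \mid j \ne i;\ x_i \cap x_j \ne \emptyset\}| \ge s\}.$$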

Theorem 6.4. Every nondeterministic formula with $s$ nondeterministic bits for
$AD_{n,20s}$ has length at least $\Omega(n^2 s \log n)$.

$AD_{n,s}$ can be computed by a nondeterministic formula of length $O(ns^2 \log n)$,
which uses $O(s \log n)$ nondeterministic bits (for $s \ge \log n$).

Proof. For the lower bound we use the methodology we have just described. We
consider the $n + 1$ partitions of the inputs in which Bob receives the set $x_i$ and Alice
all other sets. The function they have to compute now is the function $D_{n,s}$ from
definition 5.3. In Theorem 5.5 a lower bound of $\Omega(ns \log n)$ is shown for this problem,
hence the length of the formula is $\Omega(n \cdot ns \log n)$.

For the upper bound we proceed as follows: the formula guesses (in binary) a
number $i$ with $1 \le i \le n + 1$ and pairs $(j_1, w_1), \ldots, (j_s, w_s)$, where $1 \le j_k \le n + 1$
and $1 \le w_k \le n^3$ for all $k = 1, \ldots, s$. The number $i$ indicates a set, and the pairs are
witnesses that set $i$ and set $j_k$ intersect on element $w_k$.

The formula does the following tests. First there is a test whether all sets consist
of $s$ sorted elements. For this $ns$ comparisons of the form $x_i^j < x_i^{j+1}$ suffice, which
can be realized with $O(\log^2 n)$ gates each. Since $s \ge \log n$, overall $O(ns^2 \log n)$ gates
are enough.

The next test is whether $j_1 < \cdots < j_s$. This makes sure that witnesses for $s$
different sets have been guessed. Also $i \ne j_k$ for all $k$ must be tested.

Then the formula tests whether for all $1 \le l \le n + 1$ the following holds: if $l = i$,
then all guessed elements are in $x_l$; if $1 \le l \le n + 1$ and $1 \le k \le s$ the formula also
tests whether $l = j_k$ implies that $w_k \in x_l$.

All these tests can be done simultaneously by a formula of length $O(ns^2 \log n)$.

□ 
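The guess-and-verify structure of the upper bound can be mirrored by a small program. This is a sketch for illustration, not the formula itself: the function names are hypothetical, sets are represented as sorted tuples, and indices are 0-based. `verify_witness` performs the three tests described in the proof on a guessed index `i` and guessed pairs `(j_k, w_k)`, while `in_AD` decides membership directly for comparison.

```python
def is_sorted_set(x, s):
    """A set is valid if it has s strictly increasing elements."""
    return len(x) == s and all(a < b for a, b in zip(x, x[1:]))

def verify_witness(sets, i, pairs, s):
    """Check the guesses of the nondeterministic formula for AD_{n,s}."""
    # Test 1: every set consists of s sorted elements.
    if not all(is_sorted_set(x, s) for x in sets):
        return False
    # Test 2: j_1 < ... < j_s, so witnesses for s different sets were
    # guessed, and none of them refers to the set i itself.
    js = [j for j, _ in pairs]
    if len(pairs) != s or not 0 <= i < len(sets):
        return False
    if any(not 0 <= j < len(sets) for j in js) or i in js:
        return False
    if any(a >= b for a, b in zip(js, js[1:])):
        return False
    # Test 3: each witness w_k lies both in x_i and in x_{j_k}.
    return all(w in sets[i] and w in sets[j] for j, w in pairs)

def in_AD(sets, s):
    """Direct check: some set intersects at least s of the other sets."""
    if not all(is_sorted_set(x, s) for x in sets):
        return False
    for i, x in enumerate(sets):
        hits = sum(1 for j, y in enumerate(sets) if j != i and set(x) & set(y))
        if hits >= s:
            return True
    return False

# Small example with s = 2: set 0 meets set 1 on element 2 and set 2 on element 1.
example = [(1, 2), (2, 5), (1, 7), (8, 9)]
```

An input is in the language exactly when some guess passes `verify_witness`; the witness consists of $O(s \log n)$ bits, matching the number of nondeterministic bits in Theorem 6.4.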

For $0 < \epsilon < 1/2$ let $s = n^{\epsilon/(1-\epsilon)}$; then the lower bound for limited nondeterministic
formulae is $\Omega(N^{2-\epsilon} / \log^{1-\epsilon} N)$ with $N^{\epsilon} / \log^{\epsilon} N$ nondeterministic bits allowed.
$O(N^{\epsilon} \log^{1-\epsilon} N)$ nondeterministic bits suffice to construct a formula having length
$O(N^{1+\epsilon} / \log^{\epsilon} N)$. Hence the threshold for constructing an efficient formula is polynomially
large, allowing an exponential number of computations on each input.
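The parameters in the preceding paragraph can be recovered by a short calculation. It is given here as a sketch under one assumption that is not stated explicitly in this passage: the input length is taken to be $N = \Theta(ns \log n)$, since there are $n + 1$ sets of $s$ elements, each written with $O(\log n)$ bits. Constant factors such as the 20 in $AD_{n,20s}$ are absorbed.

```latex
\begin{align*}
N = \Theta(ns \log n),\quad s = n^{\epsilon/(1-\epsilon)}
  &\;\Longrightarrow\; n = \Theta\big((N/\log N)^{1-\epsilon}\big),\quad
     s = \Theta\big((N/\log N)^{\epsilon}\big),\\
n^2 s \log n &= n \cdot \Theta(N) = \Theta\big(N^{2-\epsilon} / \log^{1-\epsilon} N\big),\\
s \log n &= \Theta\big(N^{\epsilon} \log^{1-\epsilon} N\big),\\
n s^2 \log n &= s \cdot \Theta(N) = \Theta\big(N^{1+\epsilon} / \log^{\epsilon} N\big).
\end{align*}
```

The first and third lines correspond to the lower bound and the number of guessed bits in the upper bound, the last line to the length of the formula in the upper bound.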

6.2. Quantum formulae. Now we derive lower bounds for generalized quantum
formulae. In [38] pure quantum formulae are considered (recall these are quantum
formulae which may not access multiply readable random bits). The result is as
follows.

Fact 6.5. Every pure quantum formula computing a function f with bounded 
error has length 



for the Neciporuk function $D(f_i)$, see Fact 3.1 and definition 3.4.

Furthermore in [38] it is shown that pure quantum formulae can be simulated
efficiently by deterministic circuits.

Now we know from §3.2 that the Boolean function MP with $O(n^2)$ inputs (the
matrix product function) has fair probabilistic formulae of linear size $O(n^2)$, while
the Neciporuk bound is cubic (theorems 3.11 and 3.12). Thus we get the following.

Corollary 6.6. There is a Boolean function MP with $N$ inputs which can be
computed by fair Monte Carlo formulae of length $O(N)$, while every pure quantum
formula with bounded error for MP has size $\Omega(N^{3/2} / \log N)$.

We conclude that pure quantum formulae are not a proper generalization of classi- 
cal formulae. A fair probabilistic formula can be simulated efficiently by a generalized 
quantum formula on the other hand. We now derive a lower bound method for gen- 
eralized quantum formulae. First we give again a lower bound in terms of one-way 
communication complexity, then we show that the VC-Neciporuk bound is a lower 
bound, too. 

This implies with Theorem 3.9 that the maximal difference between the sizes
of deterministic formulae and generalized bounded error quantum formulae provable
with the Neciporuk method is at most $O(\sqrt{n})$.

But first let us conclude the following corollary, which states that fair probabilistic 
formulae reading their random bits only once are sometimes inefficient. 

Corollary 6.7. The (standard) Neciporuk function divided by $\log n$ is an
asymptotical lower bound for the size of fair probabilistic formulae reading their
random inputs only once.

Proof. We have to show that pure quantum formulae can simulate these special
probabilistic formulae. For each random input we use two qubits in the state $|00\rangle$.
These are transformed into the state $|\Phi^+\rangle$ by a Hadamard gate. One of the qubits
is never used again, then the other qubit has the density matrix of a random bit.
Then the probabilistic formula can be simulated. For the simulation of gates unitary
transformations on three qubits are used. These get the usual inputs of the gate
simulated plus one empty qubit as input, which after the application of the gate
carries the output. These gates are easily constructed unitarily. According to [3] each
3-qubit gate can be composed of $O(1)$ unitary gates on 2 qubits only. □
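The first step of this simulation, producing a random bit from two qubits, can be checked with a few lines of linear algebra. The sketch below makes an assumption beyond the text: it uses the standard preparation of $|\Phi^+\rangle$, namely a Hadamard gate on the first qubit followed by a controlled-NOT gate, and it works with real amplitudes only, so no complex conjugation is needed in the partial trace.

```python
from math import sqrt

def apply(matrix, vector):
    """Multiply a matrix (list of rows) with a state vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

h = 1 / sqrt(2)
# Hadamard on the first qubit, identity on the second (basis order 00, 01, 10, 11).
H_FIRST = [[h, 0, h, 0],
           [0, h, 0, h],
           [h, 0, -h, 0],
           [0, h, 0, -h]]
# Controlled-NOT with the first qubit as control and the second as target.
CNOT = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]]

state = apply(CNOT, apply(H_FIRST, [1, 0, 0, 0]))   # |00> is mapped to |Phi+>

def reduced_second_qubit(psi):
    """Density matrix of the second qubit after discarding the first
    (valid for real amplitudes, as used here)."""
    return [[sum(psi[2 * a + b] * psi[2 * a + c] for a in (0, 1))
             for c in (0, 1)] for b in (0, 1)]

rho = reduced_second_qubit(state)
```

The resulting `rho` is the maximally mixed state, i.e., the density matrix of a fair random bit, which is what the simulation of a random input requires.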
We will need the following observation pQ. 

Fact 6.8. If the density matrix of two qubits in a circuit (with nonentangled
inputs) is not the tensor product of their density matrices, then there is a gate so that
both qubits are reachable on a path from that gate.

Since the above situation is impossible in a formula, the inputs to a gate are never 
entangled. 

The first lower bound is stated in terms of one-way communication complexity. 
It is interesting that actually randomized complexity suffices for a lower bound on 
quantum formulae. 

Theorem 6.9. Let $f$ be a Boolean function on $n$ inputs and $y_1, \ldots, y_k$ a partition
of the input variables in $k$ blocks. Player Bob knows the inputs in $y_i$ and player
Alice knows all other inputs. The randomized (private coin) one-way communication
complexity of $f$ (with bounded error) under this input partition is called $R(f_i)$.

Every generalized quantum formula for $f$ with bounded error has length

$$\Omega\left( \sum_{i=1}^{k} \frac{R(f_i)}{\log R(f_i)} \right).$$

Proof. For a given partition of the input we show how a generalized quantum
formula $F$ can be simulated in the $k$ communication games, so that the randomized
one-way communication in game $i$ is bounded by a function of the number of leaves
in a subtree $F_i$ of $F$. $F_i$ contains exactly the variables belonging to Bob as leaves and
its root is the root of $F$. Furthermore $F_i$ contains all gates on paths from these leaves
to the root. Note that the additional nonentangled mixed state which the formula
may access is given to Alice.

$F$ is a tree of fan-in 2 fan-out 1 superoperators (recall that superoperators are
not necessarily reversible). "Wires" between the gates carry one qubit each. $F_i$ is a
formula that Bob wants to evaluate, the remaining parts of the formula $F$ belong to
Alice, and she can easily compute the density matrices for all qubits on any wire in
her part of the formula by a classical computation, as well as the density matrices
for the qubits crossing to Bob's formula $F_i$. Note that none of the qubits on wires
crossing to $F_i$ is entangled with another, so the state of these qubits is a probabilistic
ensemble of pure nonentangled states. Hence Alice may fix a pure nonentangled state
from this ensemble with a randomized choice.

In all communication games Bob evaluates the formula as far as possible without
the help of Alice. By an argument as in other Neciporuk methods (e.g. [TUSH! or the 
previous sections) it is sufficient to send few bits from Alice to Bob to evaluate a path 
with the following property: all gates on the path have one input from Alice and one
input from their predecessor, except for the first gate, which has one input from Alice
and one (already known) input from Bob. With standard arguments the number of
such paths is a lower bound on the number of leaves in the subformula, see §3.1. 

Hence we have to consider some path $g_1, \ldots, g_m$ in $F$, where $g_1$ has one input
or a gate from Alice as predecessor and an input or gate from Bob as the other
predecessor, and all gates $g_i$ have the previous gate $g_{i-1}$ and an input or gate from
Alice's part of the formula as predecessors. The density matrix of Bob's input to
$g_1$ is called $\rho$, and the density matrix of the other $m$ inputs is called $\sigma$. The circuit
computing $\sigma$ works on different qubits than the circuit computing $\rho$.

Thus the density matrix of all inputs to the path is $\rho \otimes \sigma$, see Fact 6.8. The path
maps $\rho \otimes \sigma$ with a superoperator $T$ to a density matrix $\mu$ on one qubit; alternatively
we may view $\sigma$ as determining a superoperator $T_\sigma$ on one qubit that has to be applied
to $\rho$. Now Alice can compute this superoperator by herself, classically.

Bob knows $\rho$. Bob wants to know the state $T_\sigma \rho$. Since this operator works on
a single qubit only, it can be described within a precision $1/poly(k)$ by a constant
size matrix containing numbers of size $O(\log k)$ for any integer $k$. Thus Alice may
communicate $T_\sigma$ to Bob within this precision using $O(\log k)$ bits.

In this way Alice and Bob may evaluate the formula, and the error of the formula
is changed only by $size_i / poly(k)$ compared to the error of the quantum formula,
where $size_i$ denotes the number of gates in $F_i$. Thus choosing $k = poly(size_i)$
the communication is bounded by $R(f_i) \le O(size_i \log size_i)$. This implies
$size_i \ge \Omega(R(f_i) / \log R(f_i))$. Summation over all $i$ yields the theorem. □
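The step in which Alice sends a single-qubit superoperator with precision $1/poly(k)$ can be illustrated by a simple fixed-point encoding. This is a hypothetical encoding chosen for the sketch, not the one used in the paper. It assumes the superoperator is written as a constant number of real entries (a linear map on $2 \times 2$ complex matrices has 16 complex, i.e., 32 real entries), each of absolute value at most 1, and rounds every entry to a multiple of $1/k$.

```python
from math import ceil, log2

def encode_entries(entries, k):
    """Round each real entry in [-1, 1] to a multiple of 1/k.
    Returns the integer codes and the number of bits used per entry."""
    bits_per_entry = ceil(log2(2 * k + 1))      # codes range over -k, ..., k
    codes = [round(e * k) for e in entries]
    return codes, bits_per_entry

def decode_entries(codes, k):
    """Recover the rounded entries from their integer codes."""
    return [c / k for c in codes]

# Hypothetical example: a few entries of a superoperator matrix, precision 1/1000.
k = 1000
entries = [0.5, -0.25, 1 / 3, -1.0, 0.0, 0.70710678]
codes, bits = encode_entries(entries, k)
decoded = decode_entries(codes, k)
max_error = max(abs(a - b) for a, b in zip(entries, decoded))
```

With 32 entries the whole description costs `32 * bits` bits, which is $O(\log k)$, and each entry is off by at most $1/(2k)$.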

The above construction loses a logarithmic factor, but in the combinatorial bounds 
we actually apply, we can avoid this, by using quantum communication and the pro- 
grammable quantum gate from Fact 4.20. 

Theorem 6.10. The VC-Neciporuk function is an asymptotical lower bound for
the length of generalized quantum formulae with bounded error. 

The Neciporuk function is an asymptotical lower bound for the length of general- 
ized quantum Las Vegas formulae. 

Proof. We proceed similarly to the above construction, but Alice and Bob use
quantum computers. Instead of communicating a superoperator in matrix form with
some precision we use the programmable quantum gate.

Alice and Bob cooperatively evaluate the formulae $F_i$ in a communication game as
before. As before, for certain paths Alice wants to help Bob to apply a superoperator
$T_\sigma$ on a state $\rho$ of his. Using Kraus representations (Fact 4.7) we can assume that
this is a unitary operator on $O(1)$ qubits (one of them $\rho$, the others blank) followed
by throwing away all but one of the qubits.

This time Alice sends to Bob the program corresponding to the unitary operation
in $T_\sigma$. Bob feeds this program into the programmable quantum gate, which tries to
apply the transformation, and if this is successful the formula evaluation can continue
after discarding the unnecessary qubits. This happens with probability $\Omega(1)$. If Alice
could get some notification from Bob saying whether the gate has operated successfully
and if not, what kind of error occurred, then Alice could send him another program
that both undoes the error and the previous operator and then makes another attempt
to compute the desired operator.

Note that the error that resulted from an application of the programmable quantum
gate is determined by the classical measurement outcome resulting in its application.
Furthermore this error can be described by a unitary transformation itself. If the
error function is $E$, the desired unitary is $U$, and the state it has to be applied to is
$\rho$, then Bob now holds $U E \rho E^\dagger U^\dagger$. Once Alice knows $E$ (which is determined by Bob's
measurement outcome), Alice can produce a program for $U E^\dagger U^\dagger$. If Bob applies this
transformation successfully they are done, otherwise they can iterate. Note that only
an expected number of $O(1)$ such iterations are necessary, and hence the expected
quantum communication in this process is $O(1)$, too.
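Two short calculations make the correction step explicit. They are a sketch under the assumptions stated in the text, namely that $E$ and $U$ are unitary and that each attempt of the programmable gate succeeds with some probability $p = \Omega(1)$; treating the attempts as independent with the same $p$ is an additional simplifying assumption made here.

```latex
\begin{align*}
(U E^\dagger U^\dagger)\,\big(U E \rho E^\dagger U^\dagger\big)\,(U E^\dagger U^\dagger)^\dagger
  &= U E^\dagger (U^\dagger U) E\, \rho\, E^\dagger (U^\dagger U) E\, U^\dagger
   = U (E^\dagger E)\, \rho\, (E^\dagger E)\, U^\dagger
   = U \rho\, U^\dagger, \\
\mathbb{E}[\text{number of attempts}]
  &= \sum_{t \ge 1} t\, p\, (1 - p)^{t-1} = \frac{1}{p} = O(1).
\end{align*}
```

The first line shows that a successful application of the program for $U E^\dagger U^\dagger$ leaves Bob with the desired state $U \rho U^\dagger$; the second is the expectation of a geometric distribution.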

So the expected communication can be reduced to $O(size_i)$. But Alice needs
some communication from Bob. Luckily this communication does not reveal any
information about Bob's input: Bob's measurement outcomes are random numbers
without correlation with his input.

So we consider the nonstandard one-way communication model from lemma 5.13,
in which Bob may talk to Alice, but without revealing any information about his
input. Using this model in the construction and letting Bob always ask explicitly
for more programs reduces the communication in game $i$ to $O(size_i)$ in the expected
sense.

With lemma 5.13 we get the lower bounds for bounded error and Las Vegas 
communication. □ 

Now we can give a lower bound for ISA showing that even generalized quantum
formulae do not compute the function significantly more efficiently than deterministic
formulae.

Corollary 6.11. Every generalized quantum formula which computes ISA with
bounded error has length $\Omega(n^2 / \log n)$.

Considering the matrix multiplication function MP we get the following. 

Corollary 6.12. There is a function which can be computed by a generalized
quantum formula with bounded error as well as by a fair probabilistic formula with
bounded error, with size $O(N)$. Every generalized quantum Las Vegas formula needs
size $\Omega(N^{3/2})$ for this task. Hence there is a size gap of $\Omega(N^{1/2})$ between Las Vegas
formula length and the length of bounded error formulae.

Since the VC-Neciporuk function is a lower bound for generalized quantum formulae,
Theorem 3.9 implies that the maximal size gap between deterministic formulae
and generalized quantum formulae with bounded error provable by the (standard)
Neciporuk method is $O(\sqrt{n})$ for input length $n$. Such a gap actually already lies
between generalized quantum Las Vegas formulae and fair probabilistic formulae with
bounded error.

7. Conclusions. In this paper we have derived lower bounds for the sizes of 
probabilistic, nondeterministic, and quantum formulae. These lower bounds follow 
the general approach of reinterpreting the Neciporuk bound in terms of one-way com- 
munication complexity. This is nontrivial in the case of quantum formulae, where 
we had to use a programmable quantum gate. Nevertheless we have obtained the same 
combinatorial lower bound for quantum and probabilistic formulae based on the VC- 
dimension. 
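
To make the combinatorial quantity concrete, the following is a minimal brute-force sketch of the VC-dimension of a communication matrix. It assumes the convention that rows are indexed by Alice's inputs and columns by Bob's inputs, and that a set of columns is shattered when the rows realize every 0/1 pattern on it. The names `vc_dimension` and `index_matrix` are hypothetical and not taken from the paper, and the exponential search is only meant for tiny examples.

```python
from itertools import combinations, product


def vc_dimension(matrix):
    """Largest number of columns shattered by the rows of a 0/1 matrix.

    Assumed convention: rows = Alice's inputs, columns = Bob's inputs.
    """
    if not matrix:
        return 0
    n_cols = len(matrix[0])
    best = 0
    for k in range(1, n_cols + 1):
        found = False
        for cols in combinations(range(n_cols), k):
            # Collect the distinct row patterns restricted to these columns.
            patterns = {tuple(row[c] for c in cols) for row in matrix}
            if len(patterns) == 2 ** k:
                found = True
                break
        if not found:
            # Subsets of shattered sets are shattered, so larger k cannot succeed.
            break
        best = k
    return best


def index_matrix(n):
    """Matrix of the index function: row x in {0,1}^n, column i, entry x_i."""
    return [list(x) for x in product((0, 1), repeat=n)]


if __name__ == "__main__":
    identity = [[1 if r == c else 0 for c in range(4)] for r in range(4)]
    print(vc_dimension(index_matrix(3)))
    print(vc_dimension(identity))
```

For the index function on 3 bits the rows shatter all three columns, whereas for the 4-by-4 identity matrix (the equality function on four inputs) no pair of columns is shattered, because no row contains two ones.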

Using the lower bound methods we have derived a general $\sqrt{n}$ gap between 
bounded error and Las Vegas formula size. Another result is a threshold phenomenon 
for the amount of nondeterminism needed to compute a function, which gives a 
near-quadratic size gap for a polynomial threshold on the number of nondeterministic bits. 

To derive our results we needed lower bounds for one-way communication com- 
plexity. While these were available in the case of probabilistic one-way communication 
complexity, we had to develop these lower bounds in the quantum and nondetermin- 
istic case. These results give gaps between 2-round and one-way communication 
complexity in these models. Those gaps have been generalized to round hierarchies 
for larger numbers of rounds in [21] and [26] for the nondeterministic and the 
quantum case, respectively. Furthermore we have shown that quantum Las Vegas one-way protocols 
for total functions are not much more efficient than deterministic one-way protocols. 
The lower bounds for quantum one-way communication complexity are also useful 
to give lower bounds for quantum automata, and for establishing that only bounded 
error quantum finite automata can be exponentially smaller than deterministic finite 
automata [24]. A generalization of the VC-dimension bound on quantum one-way 
communication complexity is given in [25]. 

We single out the following open problems: 

1. Give a better separation between deterministic and probabilistic/quantum 
formula size (see [23] for a candidate function). 

2. Separate the size complexities of generalized quantum and probabilistic for- 
mulae for some function. 

3. Investigate the power of quantum formulae that can access an entangled state 
as an additional input, thus introducing entanglement into the model. 

4. Separate quantum and probabilistic one-way communication complexity for 
some total function or show that both are related. 

5. Prove super-quadratic lower bounds for formulae over the basis of all two-ary 
Boolean functions. 